[go: up one dir, main page]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server因OOM异常退出 #397

Open
zurich37 opened this issue Apr 11, 2022 · 2 comments
Open

server因OOM异常退出 #397

zurich37 opened this issue Apr 11, 2022 · 2 comments
Assignees

Comments

@zurich37
Copy link

异常描述:
在生产环境,powerjob运行5天左右,开始出现如下WARN日志,这个时间点server所在机器没有其他任务执行:
2022-04-08 16:40:04.595 WARN 92328 --- [l-1 housekeeper] com.zaxxer.hikari.pool.HikariPool : HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=1m19s605ms453µs590ns). 60455 2022-04-08 16:40:04.595 WARN 92328 --- [l-2 housekeeper] com.zaxxer.hikari.pool.HikariPool : HikariPool-2 - Thread starvation or clock leap detected (housekeeper delta=1m19s605ms447µs612ns).

一小时左右会出现以下WARN:
`
2022-04-08 17:26:07.766 WARN 92328 --- [sTimingPool-128] t.p.s.c.scheduler.PowerScheduleService : [JobScheduleService] The database q uery is using too much time(163497ms), please check if the database load is too high!

2022-04-08 17:45:24.804 WARN 92328 --- [l-2 housekeeper] com.zaxxer.hikari.pool.HikariPool : HikariPool-2 - Thread starvation or clock leap detected (housekeeper delta=1m8s702ms237µs424ns).
60676 2022-04-08 17:45:24.804 WARN 92328 --- [l-1 housekeeper] com.zaxxer.hikari.pool.HikariPool : HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=1m8s702ms261µs859ns).
60677 2022-04-08 17:45:24.817 WARN 92328 --- [dispatcher-5494] t.p.s.remote.worker.ClusterStatusHolder : [ClusterStatusHolder-watsons-etl] r eceive the expired heartbeat from 10.82.xxx:27777, serverTime: 1649411124817, heartTime: 1649411061451
`

直至出现JDBCConnectionException
2022-04-08 17:52:36.930 ERROR 92328 --- [ XNIO-1 task-21] o.h.engine.jdbc.spi.SqlExceptionHelper : HikariPool-1 - Connection is not av ailable, request timed out after 92770ms. 60711 2022-04-08 17:52:36.965 ERROR 92328 --- [SchedulePool-74] o.s.s.s.TaskUtils$LoggingErrorHandler : Unexpected error occurred in schedu led task 60712 60713 org.springframework.transaction.CannotCreateTransactionException: Could not open JPA EntityManager for transaction; nested exception is org.hibernate.exception.JDBCConnectionException: Unable to acquire JDBC Connection 60714 at org.springframework.orm.jpa.JpaTransactionManager.doBegin(JpaTransactionManager.java:448) 60715 at org.springframework.transaction.support.AbstractPlatformTransactionManager.startTransaction(AbstractPlatformTransactionManage r.java:400) 60716 at org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(AbstractPlatformTransactionManager. java:373) 60717 at org.springframework.transaction.interceptor.TransactionAspectSupport.createTransactionIfNecessary(TransactionAspectSupport.ja va:574) va:574) 60718 at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:36 1) 60719 at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:118) 60720 at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) 60721 at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor. java:139) 60722 at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) 60723 at org.springframework.data.jpa.repository.support.CrudMethodMetadataPostProcessor$CrudMethodMetadataPopulatingMethodInterceptor .invoke(CrudMethodMetadataPostProcessor.java:149) 60724 at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) 60725 at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:95) 60726 at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) 60727 at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212) 60728 at com.sun.proxy.$Proxy149.updateGmtModifiedByIp(Unknown Source) 60729 at tech.powerjob.server.remote.server.ServerInfoService.heartbeat(ServerInfoService.java:101) 60730 at sun.reflect.GeneratedMethodAccessor234.invoke(Unknown Source) 60731 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 60732 at java.lang.reflect.Method.invoke(Method.java:498) 60733 at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84) 60734 at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54) 60735 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 60736 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 60737 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) 60738 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) 60739 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 60740 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 60741 at java.lang.Thread.run(Thread.java:748) 60742 Caused by: org.hibernate.exception.JDBCConnectionException: Unable to acquire JDBC Connection 60743 at org.hibernate.exception.internal.SQLExceptionTypeDelegate.convert(SQLExceptionTypeDelegate.java:48) 60744 at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:42) 60745 at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:113) 60746 at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:99) 60747 at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java :111) 60748 at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getPhysicalConnection(LogicalConnectionManagedImpl.java:138 ) 60749 at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getConnectionForTransactionManagement(LogicalConnectionMana gedImpl.java:273) 60750 at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.begin(LogicalConnectionManagedImpl.java:281) 60751 at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl$TransactionDriverControl Impl.begin(JdbcResourceLocalTransactionCoordinatorImpl.java:246) 60752 at org.hibernate.engine.transaction.internal.TransactionImpl.begin(TransactionImpl.java:83) 60753 at org.springframework.orm.jpa.vendor.HibernateJpaDialect.beginTransaction(HibernateJpaDialect.java:184) 60754 at org.springframework.orm.jpa.JpaTransactionManager.doBegin(JpaTransactionManager.java:402) 60755 ... 27 common frames omitted 60756 Caused by: java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 92761ms. 60757 at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:695) 60758 at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:197) 60759 at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:162) 60760 at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:128) 60761 at org.hibernate.engine.jdbc.connections.internal.DatasourceConnectionProviderImpl.getConnection(DatasourceConnectionProviderImp l.java:122) 60762 at org.hibernate.internal.NonContextualJdbcConnectionAccess.obtainConnection(NonContextualJdbcConnectionAccess.java:38) 60763 at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java :108) 60764 ... 34 common frames omitted

最终会OOM导致powerjob server崩溃:
`
java.lang.OutOfMemoryError: GC overhead limit exceeded

2022-04-10 03:52:44.887 INFO 92328 --- [sTimingPool-278] t.p.s.remote.worker.ClusterStatusHolder : [ClusterStatusHolder-watsons-etl] clean the containerInfos, listDeployedContainer service may down about 1min~
2022-04-10 04:14:10.592 INFO 92328 --- [XNIO-1 task-216] o.s.core.annotation.MergedAnnotation : Failed to introspect annotations on class tech.powerjob.server.web.controller.ServerController: java.lang.OutOfMemoryError: GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
oms-server-scheduler-7077]: GC overhead limit exceeded, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[]: oms-server-scheduler-7079]: GC overhead limit exceeded, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[oms-server]
]: GC overhead limit exceededshutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[oms-server]: GC overhead limit exceeded, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[oms-server]
oms-server]
java.lang.OutOfMemoryError: GC overhead limit exceeded
]: GC overhead limit exceeded, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[oms-server]
Uncaught error from thread [oms-server-akka.actor.internal-dispatcher-7057]: GC overhead limit exceeded, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled forException in thread "PowerJobSchedulePool-4" java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "PowerJobSchedulePool-29" java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "PowerJobSchedulePool-45" Exception in thread "Thread-2"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Thread-2"
Uncaught error from thread [Uncaught error from thread [oms-server-akka.actor.default-dispatcher-7053]: GC overhead limit exceeded, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[oms-server]
java.lang.OutOfMemoryError: GC overhead limit exceeded
Uncaught error from thread [oms-server-scheduler-7090]: GC overhead limit exceeded, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[oms-server]
oms-server-scheduler-7086]: GC overhead limit exceeded, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[oms-server]
Exception in thread "PowerJobSchedulePool-97"
`
以上现象在服务部署后总崩溃了三次,帮忙看下什么原因吧,感谢!

@Echo009 Echo009 self-assigned this Apr 12, 2022
@Echo009
Copy link
Member
Echo009 commented Apr 12, 2022

从日志上看,可能存在慢 SQL 或者连接池配置的最大连接数量过小的问题,请检查一下你的连接池配置,以及 SQL 执行日志。
至于内存溢出的问题,大概率是你启动参数中配置的内存太小了,建议根据实际使用场景调大。

另外你这里提供的信息太少了,没法分析出具体的原因,请至少补充以下信息

  1. PowerJob 版本、部署环境(OS & JDK)
  2. 启动参数配置
  3. 数据库连接池配置、数据库实例相关信息
  4. 任务数量

@lwhocmy
Copy link
lwhocmy commented Jul 28, 2022

请问后来怎么解决了呀?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants