CN116303319A

CN116303319A - A log collection method, device and storage medium

Info

Publication number: CN116303319A
Application number: CN202310250143.7A
Authority: CN
Inventors: 杨洋; 吴雷; 许少年; 辛晨; 李忆蕾; 谭佐艳; 刘璐
Original assignee: China Telecom Cloud Technology Co Ltd
Current assignee: China Telecom Cloud Technology Co Ltd
Priority date: 2023-03-13
Filing date: 2023-03-13
Publication date: 2023-06-23

Abstract

The application discloses a log collection method, device, and storage medium for preventing log loss. The disclosed log collection method comprises: saving generated logs in a file A; collecting the content of file A in real time and uploading it to the cloud; cutting and rotating the logs in file A and cleaning up expired logs; setting up a shared dictionary in shared memory; and starting a timer task that performs a path switching operation according to the shared dictionary. The application also provides a log collection device and a storage medium.

Description

A log collection method, device and storage medium

Technical Field

The present application relates to the field of computing technology, and in particular to a log collection method, device, and storage medium.

Background

With the rapid development of the Internet, more and more websites face the challenge of highly concurrent requests. When web services or dynamic gateways are used, log recording and analysis are essential. Traditional log recording and collection schemes generally send logs directly to the cloud, so logs are lost when the network is interrupted. When logs are instead saved locally and then collected, logs can also be lost when an Agent collects a log file that has log rotation enabled.

Summary of the Invention

In view of the above technical problems, embodiments of the present application provide a log collection method, device, and storage medium for log collection.

In a first aspect, a log collection method provided by an embodiment of the present application includes:

saving the generated logs in a file A;

collecting the content of file A in real time, and uploading the content to the cloud;

cutting and rotating the logs in file A, and cleaning up expired logs;

setting up a shared dictionary in shared memory;

starting a timer task, where the timer task performs a path switching operation according to the shared dictionary.

In the present invention, the content of log file A is uploaded to the cloud, file A is cut and rotated, and a timer task is started that decides, according to the shared dictionary, whether to perform a path switching operation. This improves the safety of log recording and prevents logs from being lost.

Preferably, collecting the content of file A in real time includes:

when the content of file A is collected in real time, setting up a cache of a predetermined length, where the cache is used to retransmit logs when the network is interrupted and then restored.

Starting the timer task includes:

adding a timer task to each process.

Preferably, the timer task performing the path switching operation according to the shared dictionary includes:

when the timer task times out, determining according to the shared dictionary whether a log path switching operation needs to be triggered and, if so, resetting the shared dictionary after the log path switching operation is completed.

Preferably, resetting the shared dictionary includes:

resetting the shared dictionary through a newly added HTTP API interface.

Preferably, cutting and rotating the logs in file A includes:

when log rotation is required, each process calls the HTTP API interface to perform a log path switching operation;

each process sleeps for a predetermined length of time, and after the sleep ends file A is renamed to log file B;

each process changes its log save path to a temporary file C.

When log rotation is complete, the method further includes:

renaming temporary file C to file A in mv mode, and continuing to save logs in file A.

In a second aspect, an embodiment of the present application further provides a log collection device, including:

a collection module, configured to save the generated logs in file A;

an upload module, configured to collect the content of file A in real time and upload the content to the cloud;

a rotation module, configured to cut and rotate the logs in file A and clean up expired logs;

a dictionary module, configured to set up a shared dictionary in shared memory;

a switching module, configured to start a timer task, where the timer task performs a path switching operation according to the shared dictionary.

In a third aspect, an embodiment of the present application further provides a log collection device, including a memory, a processor, and a user interface;

the memory is configured to store a computer program;

the user interface is configured to interact with a user;

the processor is configured to read the computer program in the memory, and when the processor executes the computer program, the log collection method provided by the present invention is implemented.

In a fourth aspect, an embodiment of the present application further provides a processor-readable storage medium storing a computer program, and when a processor executes the computer program, the log collection method provided by the present invention is implemented.

With the log collection method of the present invention, log rotation uses renaming rather than the copy-and-truncate mode, which ensures that all logs are recorded in a file. Switching the engine's log output to a temporary file before log rotation, and renaming the temporary file back to the original file after rotation completes, ensures that all logs are collected and that no logs are lost on the cloud.

Brief Description of the Drawings

To explain the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic flowchart of the log collection method provided by an embodiment of the present application;

FIG. 2 is log collection example 1 provided by an embodiment of the present application;

FIG. 3 is log collection example 2 provided by an embodiment of the present application;

FIG. 4 is log collection example 3 provided by an embodiment of the present application;

FIG. 5 is a schematic diagram of a log collection device provided by an embodiment of the present application;

FIG. 6 is a schematic structural diagram of another log collection device provided by an embodiment of the present application.

Detailed Description

To make the purpose, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention, without creative effort, fall within the scope of protection of the present invention.

Some terms used in the text are explained below:

1. In the embodiments of the present invention, the term "and/or" describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: A alone, both A and B, or B alone. The character "/" generally indicates an "or" relationship between the objects before and after it.

2. In the embodiments of the present application, the term "multiple" means two or more; other quantifiers are similar.

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application, without creative effort, fall within the scope of protection of the present application.

It should be noted that the order in which the embodiments are presented only reflects their sequence and does not indicate the relative merits of the technical solutions they provide.

FIG. 1 is a schematic diagram of a log collection method provided by an embodiment of the present application. As shown in FIG. 1, the method includes steps S101 to S105:

S101. Save the generated logs in file A;

S102. Collect the content of file A in real time, and upload the content to the cloud;

S103. Cut and rotate the logs in file A, and clean up expired logs;

S104. Set up a shared dictionary in shared memory;

S105. Start a timer task, where the timer task performs a path switching operation according to the shared dictionary.

As a preferred example, saving the generated logs in file A in S101 may include: when the content of file A is collected in real time, setting up a cache of a predetermined length, where the cache is used to retransmit logs when the network is interrupted and then restored.

In the present invention, file A refers to the designated file in which logs are saved. The name distinguishes it from the other files used later in the present invention and indicates that file A is not the same file as any of them.

File A is determined before log collection starts, or before logs are to be written to a file. What is determined includes the file name, the storage path, and so on. It may also include the module or process responsible for saving generated logs to file A, that is, which module or process writes the generated logs to file A.

As a preferred example, collecting the content of file A in real time and uploading the content to the cloud in S102 may include:

A log collection agent monitors and collects log file A in real time and uploads the collected logs to the cloud, and the cloud saves the received logs for later retrieval and analysis. As a preferred example, the agent adds a cache so that, when the network is interrupted and then restored, it can retransmit logs, ensuring that the logs on the cloud are complete and none are lost.

Here, the cloud refers to a cloud server or cloud storage.

As a preferred example, the log collection agent may be a separate process or a separate module. The size of the cache added to the agent can be determined in advance as needed. When the network is interrupted, the logs held in the cache are retransmitted to the cloud, which ensures that logs are not lost and remain complete. The larger the cache, the longer a network interruption it can tolerate, but the more resources it consumes. The cache size therefore needs to be determined by weighing the available resources against the safety requirements.
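The trade-off between cache size and tolerated outage length can be illustrated with a minimal Python sketch. The class name `RetransmitBuffer`, the `send` callback, and the drop-oldest overflow policy are assumptions made for illustration; the text does not specify how the agent's cache is implemented.

```python
from collections import deque


class RetransmitBuffer:
    """Bounded cache of log lines not yet accepted by the cloud (illustrative)."""

    def __init__(self, max_lines):
        # Assumption: when the cache is full the OLDEST line is dropped, so an
        # outage longer than the cache can absorb still loses lines.
        self.pending = deque(maxlen=max_lines)

    def add(self, line):
        self.pending.append(line)

    def flush(self, send):
        """Send pending lines in order; stop at the first failed send."""
        sent = 0
        while self.pending:
            if not send(self.pending[0]):
                break  # network still down: keep the line for the next attempt
            self.pending.popleft()
            sent += 1
        return sent
```

A caller would invoke `flush` periodically; lines that fail to send stay in the cache and are retried once the network recovers.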

As a preferred example, cutting and rotating the logs in file A and cleaning up expired logs in S103 includes:

The generated log file A is rotated, and a timer is set to call a script. The script checks whether log file A needs to be rotated and cleans up expired log files, which avoids degrading log write speed and performance and prevents logs from filling the entire disk. The criterion for expiry is preset as needed; as a preferred example, logs older than 7 days are treated as expired.
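The expiry check can be sketched in Python as follows. This is only an illustration: the function name `remove_expired_logs` and the use of file modification time as the age criterion are assumptions, since the text only says that logs older than a preset age (7 days in the example) are cleaned up.

```python
import os
import time

SECONDS_PER_DAY = 24 * 60 * 60


def remove_expired_logs(log_dir, max_age_days=7, now=None):
    """Delete regular files in log_dir whose mtime is older than max_age_days.

    Returns the names of the files that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    removed = []
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        # Assumption: a file's modification time stands in for the log's age.
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(name)
    return removed
```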

As a preferred example, setting up the shared dictionary in shared memory in S104 includes:

A shared dictionary is allocated in shared memory so that each process can later determine whether it needs to perform a log path switching operation. The shared dictionary is initialized at initialization time by adding one entry for each process (worker), with the process number (worker_id) as the key and false as the value. As a preferred example, a value of false indicates that no path switching operation is needed, and a value of true indicates that a path switching operation is to be performed.
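The layout of the shared dictionary can be modelled with a short Python sketch. A plain `dict` stands in for the shared-memory dictionary, and the function names and the assumption that worker ids run from 0 to `worker_count - 1` are illustrative only.

```python
def init_switch_flags(worker_count):
    """Build the flag table: one entry per worker, all False (no switch needed)."""
    # Assumption: worker ids are the integers 0 .. worker_count - 1.
    return {worker_id: False for worker_id in range(worker_count)}


def needs_switch(flags, worker_id):
    """Return True if the given worker has been asked to switch its log path."""
    return flags.get(worker_id, False)
```

Keeping one flag per worker, rather than a single global flag, lets each worker clear its own entry independently once it has switched.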

As a preferred example, starting the timer task includes:

adding a timer task to each process.

As a preferred example, starting the timer task in S105, where the timer task performs the path switching operation according to the shared dictionary, may include:

In the initialization phase, a timer task is added to each process. When the timer task times out, it looks up the value stored under the current process number (used as the key) in the shared dictionary allocated in step S104 to determine whether a log path switching operation needs to be triggered. If the value is true, the log save path is switched to file C and the corresponding key in the shared dictionary is reset to false.
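The per-process timer check can be sketched in Python. The class `Worker` and its attribute names are hypothetical, and a plain `dict` again stands in for the shared dictionary; the sketch shows only the check-switch-reset logic, not a real timer.

```python
class Worker:
    """Illustrative model of one worker process and its log path."""

    def __init__(self, worker_id, flags, log_path, temp_path):
        self.worker_id = worker_id
        self.flags = flags            # stand-in for the shared dictionary
        self.current_path = log_path  # file A
        self.temp_path = temp_path    # temporary file C

    def on_timer(self):
        """Called on every timer expiry; returns True if a switch happened."""
        if self.flags.get(self.worker_id):
            self.current_path = self.temp_path   # switch log output to file C
            self.flags[self.worker_id] = False   # reset this worker's flag
            return True
        return False
```

Because each worker resets only its own key, a worker that has already switched does nothing on later timer expiries until its flag is set again.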

As a preferred example, cutting and rotating the logs in file A includes:

Specifically, when it is detected that log rotation is required, the HTTP API interface is called so that each process performs a log path switching operation, followed by a sleep of a predetermined length to ensure that every process has triggered its log path switch. If the interface does not return success, the HTTP API request is retried, up to 5 times. Only then is the log rotation performed, that is, log file A is renamed to the rotated log file B. At this point the log collection agent loses its collection target, file A.
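The ordering of these operations can be sketched in Python. The function name, the `request_switch` callback (standing in for the HTTP API call), and two behaviours are assumptions: the text does not say whether "up to 5 times" counts the first request, nor what happens if every attempt fails, so this sketch makes at most 5 attempts in total and renames the file regardless.

```python
import os
import time


def rotate_with_switch(request_switch, path_a, path_b,
                       wait_seconds=0.5, max_attempts=5):
    """Ask workers to switch paths, wait, then rename file A to file B.

    Returns True if one of the switch requests reported success.
    """
    ok = False
    for _ in range(max_attempts):
        if request_switch():      # stand-in for calling the HTTP API
            ok = True
            break
    time.sleep(wait_seconds)      # give every worker's timer a chance to fire
    os.rename(path_a, path_b)     # the rotation itself: A becomes B
    return ok
```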

Resetting the shared dictionary includes: resetting the shared dictionary through the newly added HTTP API interface.

The method of S101 to S105 is described below with a specific example. In this example, the log collection agent is an Agent, the log collection platform is OpenResty, and the cloud stores the received logs in Elasticsearch:

Step 1: Add a Lua module to OpenResty that saves the generated logs to a file and supports specifying the save path of the log file; this log file is file A.

Step 2: Use the log collection Agent to monitor and collect log file A in real time and upload the collected logs to the cloud, where they are stored in Elasticsearch for later retrieval and analysis. The Agent adds a cache so that, when the network is interrupted and then restored, it can retransmit logs and ensure that no logs are lost on the cloud.

Step 3: Use logrotate in mv+nocreate mode to rotate log file A generated by OpenResty, and set up crontab to periodically call a bash script. The script calls logrotate to check whether log file A needs to be rotated and cleans up log files older than 7 days, which avoids degrading log write speed and performance and prevents logs from filling the entire disk.

Step 4: Allocate a shared dictionary log_path_switch in OpenResty's shared memory so that each worker can later determine whether it needs to perform a log path switching operation. Initialize this shared dictionary in OpenResty's init_by_lua phase by adding one entry per worker, with worker_id as the key and false as the value.

Step 5: In OpenResty's init_worker_by_lua phase, add a timer task to each worker. Every 0.1 seconds, the task reads the value stored under the current worker_id in the shared dictionary allocated in step 4 to determine whether a log path switching operation needs to be triggered. If the value is true, the log save path is switched to file C and the corresponding key in the shared dictionary is reset to false.

Step 6: Add an HTTP API interface to OpenResty. When called, it sets the value of every key in the shared dictionary allocated in step 4 to true.

Step 7: When logrotate detects that log rotation is required, it calls the HTTP API interface from step 6 in the prerotate phase so that each worker performs a log path switching operation, then sleeps for 0.5 seconds to ensure that every worker has triggered its log path switch. If the interface does not return success, the HTTP API request is retried, up to 5 times. Only then is the log rotation performed, that is, log file A is moved (mv) to the rotated log file B. At this point the log collection Agent loses its collection target, file A.

Step 8: When a worker's timer task detects that a log path switching operation is needed, it switches the log save path to temporary file C, and subsequently generated logs are saved in file C.

Step 9: When logrotate detects that log rotation is complete, that is, in the postrotate phase, temporary file C is moved (mv) back to the original log file A. The log collection Agent then detects that its collection target has reappeared and resumes collection. Because of how the Linux file system works (a file that is already open is unaffected by being renamed), OpenResty's new logs continue to be saved in the renamed log file A.
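The file-system behaviour that steps 7 to 9 rely on can be checked with a small Python sketch. It is not OpenResty or logrotate code: the function name and the file names `A.log`, `B.log`, and `C.log` are illustrative, and the sketch assumes a POSIX system, where an open file handle keeps pointing at the same file after that file is renamed.

```python
import os


def demo_rename_keeps_handle(directory):
    """Replay the A -> B, C -> A renames while a writer holds C open."""
    path_a = os.path.join(directory, "A.log")
    path_b = os.path.join(directory, "B.log")
    path_c = os.path.join(directory, "C.log")

    with open(path_a, "a") as f:
        f.write("before rotation\n")

    # The worker switches its output to temporary file C (step 8).
    writer = open(path_c, "a")
    writer.write("during rotation\n")
    writer.flush()

    os.rename(path_a, path_b)  # rotation: A becomes B (step 7)
    os.rename(path_c, path_a)  # postrotate: C becomes the new A (step 9)

    # The writer never reopened anything, yet this line lands in the new A.
    writer.write("after rotation\n")
    writer.close()

    with open(path_a) as f:
        new_a = f.read().splitlines()
    with open(path_b) as f:
        rotated_b = f.read().splitlines()
    return new_a, rotated_b
```

On a POSIX system the rotated file B holds only the line written before rotation, and the new file A holds both lines written through the handle that was opened on C.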

The above is only an example. To better explain the solution of the present invention in connection with this example and its usage scenarios, the use of the method is illustrated below.

As shown in FIG. 2, the client sends a request to OpenResty, and OpenResty generates log data and saves it in file A. Meanwhile, the log collection Agent collects the content of the log file and uploads it to the cloud.

As shown in FIG. 3: 1. before log rotation, logrotate calls OpenResty's HTTP API to perform the log switching operation; 2. after receiving the HTTP API request, OpenResty changes the log save path from file A to file C; 3. the client sends an access request; 4. OpenResty saves the newly generated logs to log file C, and at this point the log collection Agent stops collecting because its target file A does not exist; 5. logrotate performs the log rotation, renaming log file A to log file B.

As shown in FIG. 4: 1. after log rotation succeeds, logrotate calls the mv command to rename temporary file C to log file A; 2. the client sends an access request; 3. because of how the Linux file system works, OpenResty writes the generated logs to the renamed file A; 4. the Agent detects that log file A exists and resumes collecting logs and uploading them to the cloud.

With the log collection method of the present invention, log rotation uses renaming rather than the copy-and-truncate mode, which ensures that all logs are recorded in a file. Switching the engine's log output to a temporary file before log rotation, and renaming the temporary file back to the original file after rotation completes, ensures that all logs are collected and that no logs are lost on the cloud. Taking the above example, the technical effects obtained by the present invention are: 1) logrotate rotates logs in mv mode rather than copytruncate mode, which ensures that all OpenResty logs are recorded in a file, because with copytruncate any new logs written in the interval after the copy and before the truncate are cleared and lost. 2) For compatibility with Agent-based collection, the engine's log output is switched to a temporary file before logrotate rotates the logs, and the temporary file is moved (mv) back to the original file after rotation completes, which ensures that the Agent collects all logs and no logs are lost on the cloud. 3) The Agent adds a cache so that, when the network is interrupted and then restored, it can retransmit the logs generated during the interruption and ensure that no logs are lost on the cloud.
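The loss window attributed to copytruncate in effect 1) can be reproduced with a short Python sketch. This simulates the copy and truncate phases directly rather than invoking logrotate; the function name and the way the late write is injected between the two phases are illustrative assumptions.

```python
import os
import shutil


def copytruncate_with_late_write(path, rotated_path, late_line):
    """Simulate copy-then-truncate with a log line arriving in between."""
    shutil.copyfile(path, rotated_path)   # copy phase
    with open(path, "a") as f:            # a new log line arrives in the gap
        f.write(late_line)
    os.truncate(path, 0)                  # truncate phase wipes the late line
    with open(path) as f:
        live = f.read()
    with open(rotated_path) as f:
        rotated = f.read()
    return live, rotated
```

After the call, the late line appears in neither the live file nor the rotated copy, which is the loss that the rename-based scheme avoids.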

Based on the same inventive concept, an embodiment of the present invention further provides a log collection device. As shown in FIG. 5, the device includes:

a collection module 401, configured to save the generated logs in file A;

an upload module 402, configured to collect the content of file A in real time and upload the content to the cloud;

a rotation module 404, configured to cut and rotate the logs in file A and clean up expired logs;

a dictionary module 403, configured to set up a shared dictionary in shared memory;

a switching module 405, configured to start a timer task, where the timer task performs a path switching operation according to the shared dictionary.

As a preferred example, the collection module 401 is further configured to set up a cache of a predetermined length when the content of file A is collected in real time, where the cache is used to retransmit logs when the network is interrupted and then restored.

As a preferred example, the switching module 405 is further configured to add a timer task to each process.

As a preferred example, the switching module 405 is further configured to, when the timer task times out, determine according to the shared dictionary whether a log path switching operation needs to be triggered and, if so, reset the shared dictionary after the log path switching operation is completed.

Cutting and rotating the logs in file A includes:

As a preferred example, the switching module 405 is further configured to rename temporary file C to file A in mv mode when log rotation is complete, and to continue saving logs in file A.

It should be noted that the device shown in FIG. 5 and the above method embodiment belong to the same inventive concept, solve the same technical problem, and achieve the same technical effect. The device can implement all of the methods, and the common parts are not repeated here.

Based on the same inventive concept, an embodiment of the present invention further provides a log collection device. As shown in FIG. 6, the device includes:

a memory 502, a processor 501, and a user interface 503;

the memory 502 is configured to store a computer program;

the user interface 503 is configured to interact with a user;

the processor 501 is configured to read the computer program in the memory 502, and when the processor 501 executes the computer program, it implements:

saving the generated logs in file A;

In FIG. 6, the bus architecture may include any number of interconnected buses and bridges, which link together various circuits, specifically one or more processors represented by the processor 501 and memory represented by the memory 502. The bus architecture can also link together various other circuits such as peripheral devices, voltage regulators, and power management circuits. These are well known in the art and are not described further here. A bus interface provides the interface. The processor 501 is responsible for managing the bus architecture and general processing, and the memory 502 can store data used by the processor 501 when performing operations.

The processor 501 may be a CPU, ASIC, FPGA, or CPLD, and may also adopt a multi-core architecture.

When the processor 501 executes the computer program stored in the memory 502, any log collection method of the present invention is implemented.

It should be noted that the device shown in FIG. 6 and the method belong to the same inventive concept, solve the same technical problem, and achieve the same technical effect. The device can implement all of the methods, and the common parts are not repeated here.

本申请还提出一种处理器可读存储介质。其中，该处理器可读存储介质存储有计算机程序，所述处理器执行所述计算机程序时实现实施例一中的任一日志采集方法。The present application also proposes a processor-readable storage medium. Wherein, the processor-readable storage medium stores a computer program, and when the processor executes the computer program, any log collection method in Embodiment 1 is implemented.

需要说明的是，本申请实施例中对单元的划分是示意性的，仅仅为一种逻辑功能划分，实际实现时可以有另外的划分方式。另外，在本申请各个实施例中的各功能单元可以集成在一个处理单元中，也可以是各个单元单独物理存在，也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现，也可以采用软件功能单元的形式实现。It should be noted that the division of units in the embodiment of the present application is schematic, and is only a logical function division, and there may be another division manner in actual implementation. In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit. The above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.

Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) that contain computer-usable program code.

The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing equipment produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing equipment to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Obviously, those skilled in the art can make various changes and modifications to the present application without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present application and their technical equivalents, the present application is intended to include them as well.

Claims

1. A log collection method, characterized by comprising:

saving generated logs in a file A;

collecting the content of the file A in real time, and uploading the content to the cloud;

cutting and rotating the logs in the file A, and cleaning up expired logs;

setting up a shared dictionary in shared memory; and

starting a timer task, wherein the timer task performs a path switching operation according to the shared dictionary.

2. The method according to claim 1, wherein said collecting the content of the file A in real time comprises:

when the content of the file A is collected in real time, setting a cache of a predetermined length, the cache being used for log retransmission when the network is restored after an interruption.

3. The method according to claim 2, wherein said starting a timer task comprises:

adding a timer task to each process.

4. The method according to claim 3, wherein said timer task performing a path switching operation according to said shared dictionary comprises:

when the timer task times out, determining according to the shared dictionary whether a log path switching operation needs to be triggered, and, if a log path switching operation is required, resetting the shared dictionary after the log path switching operation is completed.

5. The method according to claim 4, wherein said resetting said shared dictionary comprises:

resetting the shared dictionary through a newly added HTTP API interface.

6. The method according to claim 5, wherein said cutting and rotating the logs in the file A comprises:

when log rotation is required, each process calling the HTTP API interface to perform a log path switching operation;

each process sleeping for a predetermined length of time, and renaming the file A to a log file B after the sleep ends; and

each process changing the log storage path to a temporary file C.

7. The method according to claim 6, further comprising, when the log rotation is completed:

converting the temporary file C into the file A in MV (move) mode, and continuing to save logs in the file A.

8. A log collection device, characterized by comprising:

an acquisition module, configured to save generated logs in a file A;

an upload module, configured to collect the content of the file A in real time, and upload the content to the cloud;

a rotation module, configured to cut and rotate the logs in the file A, and clean up expired logs;

a dictionary module, configured to set up a shared dictionary in shared memory; and

a switching module, configured to start a timer task, wherein the timer task performs a path switching operation according to the shared dictionary.

9. A log collection device, comprising a memory, a processor and a user interface, wherein:

the memory is configured to store a computer program;

the user interface is configured to interact with a user; and

the processor is configured to read the computer program in the memory, and when the processor executes the computer program, the log collection method according to any one of claims 1 to 7 is implemented.

10. A processor-readable storage medium, wherein the processor-readable storage medium stores a computer program, and when a processor executes the computer program, the log collection method according to any one of claims 1 to 7 is implemented.
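
By way of illustration only, the path switching and rotation flow recited in claims 4 to 7 can be sketched in Python as follows. This is a minimal single-process simulation under stated assumptions, not the implementation of the application: the shared dictionary is modelled as an ordinary dict, the HTTP API interface as a plain function call, and the timer task as a method invoked directly instead of by a real timer. The names `shared_dict`, `Worker`, `request_switch`, `run_timers`, `begin_rotation` and `finish_rotation` are hypothetical and do not appear in the application.

```python
import os

# Assumption: stand-in for the shared dictionary held in shared memory.
# A value of None means that no path switch is pending.
shared_dict = {"switch_to": None}


class Worker:
    """Simulates one worker process that appends log lines to its current path."""

    def __init__(self, path):
        self.path = path

    def log(self, line):
        # Reopen by path on every write so that a path switch takes effect at once.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(line + "\n")

    def on_timer(self):
        """Timer callback: switch the log path if the dictionary requests it."""
        target = shared_dict["switch_to"]
        if target is not None:
            self.path = target


def request_switch(target):
    """Assumption: stand-in for the HTTP API call that requests a path switch."""
    shared_dict["switch_to"] = target


def run_timers(workers):
    """Fire every worker's timer once, then reset the dictionary (claim 4)."""
    for worker in workers:
        worker.on_timer()
    shared_dict["switch_to"] = None


def begin_rotation(workers, file_a, file_b, file_c):
    """Claim 6: redirect workers to temporary file C, then rename A to B."""
    for path in (file_a, file_c):
        open(path, "a", encoding="utf-8").close()  # make sure both files exist
    request_switch(file_c)
    run_timers(workers)  # stands in for sleeping until every timer has fired
    os.replace(file_a, file_b)


def finish_rotation(workers, file_a, file_c):
    """Claim 7: move C back to A and continue logging to A."""
    os.replace(file_c, file_a)
    request_switch(file_a)
    run_timers(workers)
```

In this sketch, lines logged before `begin_rotation` end up in file B, while lines logged between `begin_rotation` and `finish_rotation`, and all later lines, end up in file A. A real multi-process deployment would additionally need genuine shared memory, real timers, and a sleep long enough for every process to observe the switch before file A is renamed.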