EP3001320A1 - System and method for optimally managing heterogeneous data in a distributed storage environment - Google Patents
- Publication number
- EP3001320A1 (application EP15177580.6A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- data
- storage pool
- storage
- memory
- metadata
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G06F3/06 — Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers; G06F3/0601 — Interfaces specially adapted for storage systems, with the following subclasses:
- G06F3/061 — Improving I/O performance
- G06F3/0617 — Improving the reliability of storage systems in relation to availability
- G06F3/0635 — Configuration or reconfiguration of storage systems by changing the path, e.g. traffic rerouting, path reconfiguration
- G06F3/0647 — Migration mechanisms; G06F3/0649 — Lifecycle management
- G06F3/067 — Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
- G06F11/1435 — Saving, restoring, recovering or retrying at system level using file system or storage system metadata (under G06F11/14 — Error detection or correction of the data by redundancy in operation)
Description
- The present disclosure relates generally to managing data, and more particularly, but not limited to, optimally managing heterogeneous data in a distributed storage environment in real time.
- In the present-day scenario, considering the exponential increase in data storage requirements and the drastic reduction in storage cost per gigabyte, aggressive optimization of data written to storage media is not seriously undertaken. This leads to the generation of extremely large unmanaged datasets distributed across multiple systems, cloud-based systems, and various other places. Querying these large datasets entails significant effort, as it is not known which data lies where. Further, there are no mechanisms available for creating a common index for pulling out data from extremely large datasets spread across various systems, or for handling such data efficiently.
- Currently, data is spread across multiple servers which are interconnected with each other. Various techniques are being developed to leverage the collective power of all the interconnected servers. The main problem is how to efficiently make use of data resources spread across servers as a single pool of resources for data processing applications, i.e., how to deal with extremely large datasets (for example, a company's archived official datasets, video surveillance data, or web-crawled data for a search engine) which may be largely unstructured and are continuously expanding with time. The main problems associated with such data are as follows:
- Lack of a proper centralized yet distributed storage space;
- Lack of computing services available for the given data size;
- No proper access mechanism;
- Data is unstructured; and/or
- Access is very slow.
- In view of the above drawbacks, it would be desirable to have a mechanism to use large datasets spread across systems in an efficient and fault-tolerant manner in real time.
- Disclosed herein is a method for optimally managing heterogeneous data in a distributed storage environment. The method includes initializing a first storage pool capable of storing data from one or more sources, the first storage pool being distributed across one or more computing devices; storing data from the one or more sources in the first storage pool; generating one or more memory pools in a second storage pool based on the amount of data to be stored in the first storage pool and one or more parameters associated with the data stored in the first storage pool; and creating metadata in a first memory pool of the one or more memory pools for the data stored in the first storage pool, the metadata enabling retrieval of the data stored in the first storage pool in real time.
- In another aspect of the invention, a system for optimally managing heterogeneous data in a distributed storage environment is disclosed. The system includes one or more hardware processors and a computer-readable medium storing instructions that, when executed by the one or more hardware processors, cause the one or more hardware processors to perform operations. The operations may include initializing a first storage pool capable of storing data from one or more sources, the first storage pool being distributed across one or more computing devices; storing data from the one or more sources in the first storage pool; generating one or more memory pools in a second storage pool based on the amount of data to be stored in the first storage pool and one or more parameters associated with the data stored in the first storage pool, the second storage pool being distributed across the one or more computing devices; and creating metadata in a first memory pool of the one or more memory pools for the data stored in the first storage pool, the metadata enabling retrieval of the data stored in the first storage pool in real time.
- In yet another aspect of the invention, a non-transitory computer-readable medium storing instructions for optimally managing heterogeneous data in a distributed storage environment is disclosed; the instructions, when executed by one or more hardware processors, cause the one or more hardware processors to perform the same operations as described above.
- Additional objects and advantages of the present disclosure will be set forth in part in the following detailed description, and in part will be obvious from the description, or may be learned by practice of the present disclosure. The objects and advantages of the present disclosure will be realized and attained by means of the elements and combinations particularly pointed out in the appended claims.
- The accompanying drawings, which constitute a part of this specification, illustrate several embodiments and, together with the description, serve to explain the disclosed principles. In the drawings:
- FIG. 1 illustrates a network environment incorporating a system for optimally managing heterogeneous data in real time among a plurality of devices in a network, according to some embodiments of the present disclosure.
- FIG. 2 is a flowchart of an exemplary method for optimally managing heterogeneous data among the plurality of devices in the network in real time, according to some embodiments of the present disclosure.
- FIG. 3 is a block diagram of an exemplary computer system for implementing embodiments consistent with the present disclosure.
- As used herein, reference to an element by the indefinite article "a" or "an" does not exclude the possibility that more than one of the element is present, unless the context requires that there is one and only one of the elements. The indefinite article "a" or "an" thus usually means "at least one." The disclosure of numerical ranges should be understood as referring to each discrete point within the range, inclusive of endpoints, unless otherwise noted.
- As used herein, the terms "comprise," "comprises," "comprising," "includes," "including," "has," "having," "contains," or "containing," or any other variation thereof, are intended to cover a non-exclusive inclusion. For example, a composition, process, method, article, system, apparatus, etc. that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed. The terms "consist of," "consists of," "consisting of," or any other variation thereof, exclude any element, step, or ingredient, etc., not specified. The terms "consist essentially of," "consists essentially of," "consisting essentially of," or any other variation thereof, permit the inclusion of elements, steps, or ingredients, etc., not listed, to the extent they do not materially affect the basic and novel characteristic(s) of the claimed subject matter.
- The present disclosure relates to a system and a method that leverage a combination of distributed memory management and distributed data management. A centralized memory spread across systems is integrated with persistent storage spread across the systems, where data can reside efficiently, with custom-built algorithms for efficient access to the data.
- The system is based on a centralized and distributed architecture, where a data index or metadata (e.g., location of data, access rights, etc.) is placed centrally in one or more memory pools and the actual data is stored locally across multiple systems, each contributing part of the one or more memory pools. In this way, each system knows the location of all files present while holding only a small part of the data. Similar data is categorized together, which helps the system browse across similar categories efficiently. When a user wants particular data, he or she only needs to access the metadata, which is stored centrally in the one or more memory pools. Since the metadata has information about the location of data present across the systems, a stream of the requested data is redirected back to the user instantly.
- FIG. 1 illustrates a network environment 100 incorporating a system 102 for optimally managing data in real time among a plurality of devices 104 in a network 106, according to some embodiments of the present disclosure.
- The system 102 may be implemented in a variety of computing systems, such as a laptop computer, a desktop computer, a notebook, a workstation, a mainframe computer, a server, a network server, and the like. Further, as shown in FIG. 1, the plurality of devices 104-1, 104-2, 104-3, ..., 104-N are communicatively coupled to each other and to the system 102 through the network 106, facilitating one or more end users in accessing and/or operating the system 102.
- The system 102 may aggregate the physical memory of the plurality of devices 104-1, 104-2, 104-3, ..., 104-N, collectively referred to as devices 104 and individually referred to as device 104, to create a pool of memory resources.
- Examples of the devices 104 include, but are not limited to, a desktop computer, a portable computer, a server, a handheld device, and a workstation. The devices 104 may be used by various stakeholders or end users, such as system administrators and application developers.
- In one implementation, the system 102 may be configured on at least one of the devices 104 to aggregate the memory of the plurality of devices 104.
- The network 106 may be a wireless network, a wired network, or a combination thereof. The network 106 can be implemented as one of the different types of networks, such as an intranet, a local area network (LAN), a wide area network (WAN), or the Internet.
- The network 106 may either be a dedicated network or a shared network, which represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), etc., to communicate with each other.
- Further, the network 106 may include a variety of network devices, including routers, bridges, servers, computing devices, storage devices, etc.
- The system 102 may include a processor (not shown in FIG. 1), a memory (not shown in FIG. 1) coupled to the processor, and interfaces (not shown in FIG. 1). The processor may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the processor is configured to fetch and execute computer-readable instructions stored in the memory. The memory can include any non-transitory computer-readable medium known in the art including, for example, volatile memory (e.g., RAM) and/or non-volatile memory (e.g., EPROM, flash memory, etc.).
- The interface(s) may include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, etc., allowing the system 102 to interact with the devices 104. Further, the interface(s) may enable the system 102 to communicate with other computing devices. The interface(s) can facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example LAN, cable, etc., and wireless networks such as WLAN, cellular, or satellite. The interface(s) may include one or more ports for connecting a number of devices to each other or to another server.
- The system 102 may include a centralized memory 108, a storage repository 110, a metadata serialization manager 112, a storage mapping module 114, a data operation catalogue 116, a combined sharding engine 118, and a data passage interface 120.
- The centralized memory 108 may be volatile random access memory (RAM), whereas the storage repository 110 may be nonvolatile persistent storage, such as hard disk drive (HDD) or solid state drive (SSD) devices.
- In an exemplary embodiment, suppose there are 20 servers, each having 20GB of HDD/SSD storage. Each server may contribute 4GB towards a unified pool of memory resources with a total capacity of 80GB; the 80GB results from the aggregation of the physical memory resources of each of the servers, presented as a single unified pool.
- Both the centralized memory 108 and the storage repository 110 may be spread across the devices 104, which are connected to each other and contribute to the pooled memory resources.
- A plurality of services run on the devices 104, to which all the memory resources present in the various devices 104 are exposed. These services may be used to glue together all the memory spread across the devices 104 and make it available as a single pool of memory, for example, the centralized memory 108, so that memory-intensive applications can run efficiently.
- If a new device 104 is installed in the network environment 100, the service may be installed on it automatically. The service may expose the memory of the device 104 on which it is running. All the exposed memory resources spread across the devices 104 may be linked and, after linking, made available as a single pool of unified resources, specifically memory.
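The disclosure does not give an implementation of these services. As a rough illustration only, the following Python sketch (all class and field names are hypothetical) shows how per-device services might register their exposed memory and be aggregated into a single pool, using the 20-server, 4GB-per-server example above.

```python
# Illustrative sketch (not from the patent): per-device services expose
# local memory and are linked into a single unified pool.
from dataclasses import dataclass

@dataclass
class MemoryService:
    device_id: str
    exposed_mb: int  # memory this device contributes to the pool

class UnifiedMemoryPool:
    def __init__(self):
        self.services = []

    def register(self, service: MemoryService) -> None:
        # A newly installed device's service registers automatically.
        self.services.append(service)

    @property
    def total_mb(self) -> int:
        # The pool presents the sum of all exposed memory as one resource.
        return sum(s.exposed_mb for s in self.services)

pool = UnifiedMemoryPool()
for i in range(20):
    pool.register(MemoryService(device_id=f"server-{i}", exposed_mb=4096))
print(pool.total_mb)  # 81920 MB, i.e. the 80GB pool from the example
```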
- The centralized memory 108 may comprise one or more memory pools generated by a communicatively coupled metadata serialization manager 112. In an exemplary embodiment, the one or more generated memory pools may comprise a metadata memory pool 124 and a cached data store memory pool 126.
- Each device 104 contributing to the centralized memory 108 has an underlying persistent storage space whose capacity is usually more than 100x larger than the memory it contributes to the centralized memory 108. This storage may be used for the high-volume storage repository 110, handling heterogeneous varieties of data spread across persistent storage in the devices 104.
- The details of each file stored in the distributed repository 110 may be updated in the metadata stored in the metadata memory pool 124. The metadata is updated by the metadata serialization manager 112, which collects details from the storage mapping module 114.
- The metadata serialization manager 112 may accept details of the data stored in persistent storage and update the metadata stored in the centralized memory 108. After getting the updates, it may access data maps from the storage mapping module 114 for each file, serialize them, and convert them into memory objects for in-memory access. Data maps may comprise the combined details of each file stored in the storage repository 110. The serialized metadata objects may also be written redundantly to the file system in the storage repository 110 at defined time intervals, for recovering from failures. The metadata details captured from the storage mapping module 114 may include the following parameters, which can grow over time: the location of the actual data files spread across multiple devices 104; file permission details; concurrent access rights and exclusive access details; data criticality (the number of data copies to maintain across the devices 104 for fault tolerance); data shard details; and file status (deleted, writing in progress, streaming, cached in the cached data store memory pool or not, pending deletion, etc.).
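As an informal model of the metadata parameters listed above, the sketch below shows a per-file metadata record and a manager that updates in-memory objects and periodically flushes them for failure recovery. The field names and the JSON snapshot format are assumptions, not the patent's actual schema.

```python
# Hypothetical metadata record and serialization manager.
import json
from dataclasses import dataclass, asdict

@dataclass
class FileMetadata:
    locations: list         # devices 104 holding the actual data files
    permissions: str        # file permission details
    max_concurrent: int     # concurrent access rights (semaphore count)
    criticality: int        # number of copies kept for fault tolerance
    shard_ids: list         # data shard details
    status: str = "stored"  # deleted / writing / streaming / cached / ...

class MetadataSerializationManager:
    def __init__(self, flush_path: str):
        self.metadata: dict = {}   # in-memory objects (metadata pool 124)
        self.flush_path = flush_path

    def update(self, filename: str, meta: FileMetadata) -> None:
        self.metadata[filename] = meta

    def flush(self) -> None:
        # Redundant periodic write into the storage repository so the
        # index can be recovered after a failure.
        with open(self.flush_path, "w") as f:
            json.dump({k: asdict(v) for k, v in self.metadata.items()}, f)

mgr = MetadataSerializationManager("metadata_snapshot.json")
mgr.update("a.log", FileMetadata(["server-3", "server-9"],
                                 "rw-r--r--", 5, 2, ["shard-0"]))
mgr.flush()
```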
- The storage mapping module 114 may be designed to take updates from the data operation catalogue module 116, the storage repository 110, and the centralized memory 108 to create an overall map of the data stored in persistent storage in the storage repository 110. The created map contains details captured from all the modules it interacts with.
- The map may comprise an index of all the operations performed on a given data set, obtained with the help of the data operation catalogue module 116. The map may also comprise the current location of, and details about, all the copies of the given data set, using the storage repository 110; when the given data set is cached in the cached data store memory pool 126, the same is updated in the map.
- This map may be forwarded to the metadata serialization manager 112 for updating the overall metadata status in the metadata memory pool. The caching status of each file may be available from the storage mapping module 114, since it has direct access to the storage repository 110 and the cached data store memory pool 126.
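A minimal sketch of such a data map, assuming invented structures for the operation index, replica locations, and cache index:

```python
# Sketch of the overall data map built by the storage mapping module;
# the input structures are illustrative, not the patent's.
def build_data_map(filename, operation_log, replica_locations, cache_index):
    """Combine details from the data operation catalogue, the storage
    repository, and the cached data store into one map entry."""
    return {
        "file": filename,
        "operations": operation_log.get(filename, []),  # ops index
        "copies": replica_locations.get(filename, []),  # all replica sites
        "cached": filename in cache_index,              # pool 126 status
    }

ops = {"a.log": ["upload", "append"]}
copies = {"a.log": ["server-3", "server-9"]}
cache = {"a.log"}
print(build_data_map("a.log", ops, copies, cache))
```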
- The data passage interface 120 may be a single point of contact where an end user can interact with the system 102.
- The data passage interface 120 may be responsible for handling all the input and output operations in the system 102. Whenever there is a read request for any file stored in the system, it is catered via the data passage interface 120, which interacts with the metadata stored in the metadata memory pool 124 using the metadata serialization manager 112 and the storage mapping module 114. When the requested data is present in the cached data store memory pool 126, it may be served to the data passage interface 120 using the storage mapping module 114. This interface may also provide the access frequency of each file to the storage mapping module 114, which helps in caching frequently accessed data in the cached data store memory pool 126 in the centralized memory 108 for even faster access.
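The read path described above might look roughly like the following sketch; the cache-promotion threshold and the fetch_from helper are hypothetical stand-ins for details the disclosure leaves open.

```python
# Rough sketch of the described read path with frequency-based caching.
CACHE_THRESHOLD = 3  # assumed "defined frequency of access"

def fetch_from(device: str) -> bytes:
    """Placeholder for streaming a file from the device that stores it."""
    return b"..."

class DataPassageInterface:
    def __init__(self, metadata: dict, cache: dict):
        self.metadata = metadata   # file -> replica devices (pool 124)
        self.cache = cache         # file -> bytes (pool 126)
        self.access_count: dict = {}

    def read(self, filename: str) -> bytes:
        self.access_count[filename] = self.access_count.get(filename, 0) + 1
        if filename in self.cache:          # hot file: served from memory
            return self.cache[filename]
        replicas = self.metadata[filename]  # in-memory metadata lookup
        data = fetch_from(replicas[0])      # stream from a replica
        if self.access_count[filename] >= CACHE_THRESHOLD:
            self.cache[filename] = data     # promote for even faster access
        return data

dpi = DataPassageInterface({"a.log": ["server-3"]}, {})
for _ in range(3):
    dpi.read("a.log")
print("a.log" in dpi.cache)  # True: promoted on the third access
```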
- All tasks related to data writing in the distributed storage layer of the storage repository 110 may be handled by the combined sharding module 118. This module has an intelligent sharding engine which may detect the type of each file and categorize files accordingly. A user may also specify a custom type, which helps him or her categorize files according to his or her needs. If a file type is not detected by this module, or the user has disabled file type detection, the file is stored directly in one of the available persistent stores with ample space; a sketch of this placement logic follows below.
- The sharding details and the location of the file in persistent storage may be forwarded to the data operation catalogue module 116, which helps in updating the storage mapping module 114 and the overall metadata of the storage repository 110.
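A toy version of the placement decision could look like this; the extension-to-category table and store names are invented for illustration.

```python
# Toy placement decision for the combined sharding engine.
import os

TYPE_POOLS = {".mp4": "video", ".log": "logs", ".csv": "tabular"}

def place_file(path: str, custom_type: str = None,
               detect: bool = True, free_space: dict = None) -> str:
    if custom_type:                    # user-specified custom category
        return custom_type
    if detect:
        pool = TYPE_POOLS.get(os.path.splitext(path)[1])
        if pool:
            return pool                # grouped with similar files
    # Undetected type, or detection disabled: any store with ample space.
    return max(free_space, key=free_space.get)

print(place_file("cam1.mp4"))                                 # video
print(place_file("blob.bin", free_space={"s1": 5, "s2": 9}))  # s2
```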
- A real-time data repository 110 should provide enough performance to qualify as real-time while still supporting most I/O operations on its data. The data operation catalogue module 116 may be responsible for handling and processing all the I/O operations performed in the system. The data operation catalogue module 116 may primarily provide upload, download, delete, and append operations; the operations are not limited to these four types and can be extended over time.
- When a user uploads a file through the data passage interface 120, it may be routed via the combined sharding module 118, which decides where to place the file according to the shard hints given or the file type uploaded. A user may also specify the data criticality of the file, so that the system 102 may redundantly place multiple copies of the same file in the storage repository 110 for fault tolerance. The same information (including the location of the multiple copies) may be updated in the metadata memory pool 124 using the storage mapping module 114 and the metadata serialization manager 112.
- For downloads, the data passage interface 120 may interact with the storage mapping module (after getting the storage location from the metadata serialization manager 112) to find the nearest copy of the requested data, which can then be served to the requester.
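Nearest-copy selection reduces to picking the replica that minimizes some locality metric; the disclosure does not name the metric, so the sketch below assumes a per-device distance function (e.g., network hops) supplied by the caller.

```python
# Sketch of nearest-copy selection; "distance" is a stand-in metric.
def nearest_copy(replicas: list, distance_to) -> str:
    return min(replicas, key=distance_to)

hops = {"server-3": 2, "server-9": 1}
print(nearest_copy(["server-3", "server-9"], hops.get))  # server-9
```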
- The data operation catalogue module 116 may include a data access index creator 128 and a metadata integrator 130.
- The system 102 may handle multiple users accessing the same data at the same time. Accessing the same data may cause corruption when more than one user is updating the same file. The storage repository 110 has a connection to the centralized memory pools, which expose an API module that caters the access mechanism for each file stored in the storage repository 110. It uses semaphores and mutual exclusion (mutex), whereby multiple users can effectively access files concurrently based on the status of the semaphore and mutex index. In an exemplary embodiment, if a file has a semaphore value of five, the repository allows five concurrent users to access the same file without corruption. Mutual exclusion provides exclusive access to a file, which only one user can hold at a time; a mutex is mostly required for update operations on the stored data.
- The work of the metadata integrator 130 is to aggregate access details about each file stored in the storage repository 110 and update the same in the metadata using the storage mapping module 114 and the metadata serialization manager 112. To avoid data corruption from allowing more than the defined number of accesses to a single file, the metadata integrator may actively update the status of each file stored in the system, since users may access files continuously and the access index can increase and decrease frequently.
- FIG. 2 is a flowchart of an exemplary method for optimally managing data among a plurality of devices 104 in a network 106 in real time, according to some embodiments of the present disclosure. The method may be executed by the system 102, as described in further detail below. It is noted, however, that the functions and/or steps of FIG. 2 as implemented by the system 102 may be provided by different architectures and/or implementations without departing from the scope of the present disclosure.
- At step 200, the heterogeneous data sources are identified and the storage repository 110 is initialized in real time. Instructions may be given to the system 102 for initializing the storage repository 110.
- The initialization starts all the required services on the devices 104 and provides the information parameters of the centralized memory 108 and of the total persistent storage, i.e., the storage repository 110. After the services have been initialized and the capacity details have been circulated across the servers, the storage repository 110 may be made available for use.
- Data loading may be started after creating one or more memory pools in the centralized memory 108. The one or more memory pools may comprise the metadata memory pool 124 and the cached data store memory pool 126, which are defined as follows:
- Metadata memory pool: The storage repository 110 serves requests in real time, with instant access to any file requested among a large collection of heterogeneous files. This is achieved by various mechanisms built into the storage repository 110, such as a caching mechanism, in-memory access to file locations, dedicated metadata memory pools of adequate size, and access to the persistent storage. All these mechanism details are accessed in fractions of a second for serving real-time requests, by storing them in the centralized memory 108 in the form of metadata. The metadata are the serialized memory objects which hold the details of each file stored in the storage repository 110: the location of the nearest copy, the state of the file, the number of users connected, cached file details, the number of memory pools used, etc.
- Cached data store memory pool: This pool is cache storage for files which are frequently accessed through the system 102, and serves to reduce the overall load on the storage repository 110. After a defined frequency of access, the data passage interface 120 may instruct the storage mapping module 114 to cache a given file for even faster access. The status of the file is then updated in the metadata memory pool 124, and the next access to the same file may be served from the cached data store memory pool 126 without involving the storage repository at all.
- The number and size of these memory pools are based on the amount of data to be loaded in the system and various other parameters, such as future memory requirements, data access policies, data redundancy details, etc.
- One of the memory pools, the metadata memory pool 124, is dedicated to handling the metadata for all the data stored in the storage repository 110. This metadata has all the details about the stored data: the location of the data, access policies, number of copies, status of the data, retention period, etc. It also has details about the nearest copy of the data across the devices 104, to serve it in the lowest time possible. All the operations performed on the storage repository 110 are handled by the APIs provided.
- The storage repository 110 may serve heterogeneous data in real time. This speed may be maintained with the help of the various memory pools present in the centralized memory 108; the metadata memory pool 124 in particular handles the metadata of all the data stored in the storage repository 110.
- Persistent storage may be the hardware storage devices (HDD/SSD, etc.) present in all the devices 104 for storing data. Whenever the storage repository 110 is restarted, instead of creating the metadata index again, it may directly read the metadata previously flushed to persistent storage and validate it against any changes. This saves overall time and makes the system more efficient.
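A sketch of this restart path, assuming the JSON snapshot format used in the earlier metadata sketch:

```python
# Load the flushed metadata snapshot instead of re-indexing, then
# validate it against what persistent storage actually holds.
import json
import os

def recover_metadata(flush_path: str, list_stored_files) -> dict:
    if not os.path.exists(flush_path):
        return {}                 # no snapshot: a full re-index is needed
    with open(flush_path) as f:
        metadata = json.load(f)
    on_disk = set(list_stored_files())
    # Drop entries whose files changed or vanished while the repository
    # was down; a fuller check could compare sizes or checksums.
    return {name: m for name, m in metadata.items() if name in on_disk}

print(recover_metadata("metadata_snapshot.json", lambda: ["a.log"]))
```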
- The devices 104 which collectively form part of the heterogeneous storage repository 110 may fail from time to time. This results in an overall change in the available resources; specifically, the overall available resources are reduced by the amount the failed systems were providing to the repository. Whenever there is a change in the configuration of a system due to a failure, the update may be distributed instantly to all other devices 104 so they can update themselves accordingly. Thus the current state of the storage repository 110 may always be transparent to the user, avoiding unknown failures, data corruption, or data loss.
- At step 206, the file operation processes are performed and the corresponding metadata is updated.
- One of the operations is the upload operation. Whenever a file is uploaded to the storage repository 110, the file is stored in an appropriate location based on the available space. During upload, the user has to specify the criticality of the file so that the storage repository can place multiple copies of the given file across the devices 104. The details are then updated in the metadata.
- Another operation is the download operation. The system 102 may automatically identify the nearest copy of the requested file across the devices 104 using the in-memory metadata, and then redirect the requester to that file copy.
- Yet another operation is the delete/update operation. When a file is deleted or updated in the real-time file system, the deleted or older version of the file is maintained at a separate location, and the same is updated in the in-memory metadata. The retention period of the deleted data can be configured in the repository.
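A minimal sketch of delete handling with a configurable retention period; the trash structure and the period value are assumptions, not part of the disclosure.

```python
# Assumed delete handling: the removed version is kept at a separate
# location and expires after a configurable retention period.
import time

RETENTION_SECONDS = 7 * 24 * 3600   # assumed configurable value

def delete_file(store: dict, trash: dict, name: str) -> None:
    trash[name] = (store.pop(name), time.time())  # keep older version

def purge_expired(trash: dict, now: float = None) -> None:
    now = now or time.time()
    for name in [n for n, (_, t) in trash.items()
                 if now - t > RETENTION_SECONDS]:
        del trash[name]

store, trash = {"a.log": b"v1"}, {}
delete_file(store, trash, "a.log")
print("a.log" in trash)  # True: retained until the period elapses
```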
- At step 208, distributed access to the data sources is configured using the semaphore and mutex APIs.
- The files stored in the storage repository 110 may be configured to be accessed in real time. There may also be a requirement for a single file stored in the persistent storage to be accessed by multiple users of the storage repository 110. This kind of access is fine when there are only read requests; but in cases where multiple users are accessing a file for read and write operations, the data might get corrupted. To handle such situations, an API module may be provided to configure distributed access in persistent storage as well.
- Distributed access uses the concept of semaphores and mutual exclusion for defining access. Once a user acquires a mutex lock over a file, it cannot be accessed by any other user, and a wait flag is given to them. For concurrent reads, semaphores are used: the semaphore count defines the number of simultaneous users. For example, when a user is granted semaphore access, the total semaphore count is reduced by one, and thus the number of remaining simultaneous accesses is also reduced; when the operation finishes, the semaphore index is increased by one again. All the distributed-access details are accessed and updated via metadata access. A sketch of this scheme follows below.
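Python's threading primitives can express the described counting behavior in a single process; a real distributed repository would need distributed equivalents of these locks, which the disclosure does not detail.

```python
# Single-process sketch of the semaphore/mutex access scheme.
import threading

class FileAccessControl:
    def __init__(self, concurrent_readers: int = 5):
        # e.g. a semaphore value of five allows five concurrent users
        self.sem = threading.Semaphore(concurrent_readers)
        self.mutex = threading.Lock()    # exclusive access for updates

    def read(self, do_read):
        with self.sem:                   # acquiring decrements the count
            return do_read()             # releasing increments it again

    def update(self, do_write):
        with self.mutex:                 # one writer; others must wait
            return do_write()

ctrl = FileAccessControl()
print(ctrl.read(lambda: b"file bytes"))
```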
- At step 210, errors that arise are rectified using the metadata.
- Whenever a device fails, the information is quickly circulated to the metadata memory pool 124, which then initiates the creation of new copies of files whose redundancy has decreased after the system failure. The overall state of, and space available in, the storage repository is also updated in the metadata.
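A sketch of the re-replication decision, assuming metadata entries that record each file's replica locations and criticality (copy count to maintain):

```python
# Assumed failure handling: when a device drops out, find files whose
# live copy count fell below their criticality and schedule new copies.
def plan_repairs(metadata: dict, failed_device: str) -> list:
    repairs = []
    for name, meta in metadata.items():
        live = [d for d in meta["copies"] if d != failed_device]
        if len(live) < meta["criticality"]:
            repairs.append((name, live, meta["criticality"] - len(live)))
    return repairs

meta = {"a.log": {"copies": ["s1", "s2"], "criticality": 2}}
print(plan_repairs(meta, "s2"))  # [('a.log', ['s1'], 1)]
```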
- FIG. 3 is a block diagram of an exemplary computer system for implementing embodiments consistent with the present disclosure. Variations of computer system 301 may be used for implementing any of the devices and/or device components presented in this disclosure, including system 102.
- Computer system 301 may comprise a central processing unit (CPU or processor) 302. Processor 302 may comprise at least one data processor for executing program components for executing user- or system-generated requests. A user may include a person using a device such as those included in this disclosure, or such a device itself. The processor may include specialized processing units such as integrated system (bus) controllers, memory management control units, floating point units, graphics processing units, digital signal processing units, etc.
- The processor may include a microprocessor, such as an AMD Athlon, Duron or Opteron, an ARM application, embedded or secure processor, an IBM PowerPC, an Intel Core, Itanium, Xeon, Celeron, or other line of processors, etc. The processor 302 may be implemented using mainframe, distributed processor, multi-core, parallel, grid, or other architectures. Some embodiments may utilize embedded technologies like application-specific integrated circuits (ASICs), digital signal processors (DSPs), field programmable gate arrays (FPGAs), etc.
- Processor 302 may be disposed in communication with one or more input/output (I/O) devices via I/O interface 303. The I/O interface 303 may employ communication protocols/methods such as, without limitation, audio, analog, digital, monaural, RCA, stereo, IEEE-1394, serial bus, universal serial bus (USB), infrared, PS/2, BNC, coaxial, component, composite, digital visual interface (DVI), high-definition multimedia interface (HDMI), RF antennas, S-Video, VGA, IEEE 802.11 a/b/g/n/x, Bluetooth, cellular (e.g., code-division multiple access (CDMA), high-speed packet access (HSPA+), global system for mobile communications (GSM), long-term evolution (LTE), WiMax, or the like), etc.
- Using the I/O interface 303, the computer system 301 may communicate with one or more I/O devices. For example, the input device 304 may be an antenna, keyboard, mouse, joystick, (infrared) remote control, camera, card reader, fax machine, dongle, biometric reader, microphone, touch screen, touchpad, trackball, sensor (e.g., accelerometer, light sensor, GPS, gyroscope, proximity sensor, or the like), stylus, scanner, storage device, transceiver, video device/source, visor, etc. Output device 305 may be a printer, fax machine, video display (e.g., cathode ray tube (CRT), liquid crystal display (LCD), light-emitting diode (LED), plasma, or the like), audio speaker, etc.
- A transceiver 306 may be disposed in connection with the processor 302. The transceiver may facilitate various types of wireless transmission or reception. For example, the transceiver may include an antenna operatively connected to a transceiver chip (e.g., Texas Instruments WiLink WL1283, Broadcom BCM4750IUB8, Infineon Technologies X-Gold 518-PMB9800, or the like), providing IEEE 802.11a/b/g/n, Bluetooth, FM, global positioning system (GPS), 2G/3G HSDPA/HSUPA communications, etc.
- The processor 302 may be disposed in communication with a communication network 308 via a network interface 307. The network interface 307 may communicate with the communication network 308. The network interface may employ connection protocols including, without limitation, direct connect, Ethernet (e.g., twisted pair 10/100/1000 Base T), transmission control protocol/internet protocol (TCP/IP), token ring, IEEE 802.11a/b/g/n/x, etc. The communication network 308 may include, without limitation, a direct interconnection, local area network (LAN), wide area network (WAN), wireless network (e.g., using Wireless Application Protocol), the Internet, etc.
- Using the network interface 307 and the communication network 308, the computer system 301 may communicate with various devices. These devices may include, without limitation, personal computer(s), server(s), fax machines, printers, scanners, and various mobile devices such as cellular telephones, smartphones (e.g., Apple iPhone, Blackberry, Android-based phones, etc.), tablet computers, eBook readers (Amazon Kindle, Nook, etc.), laptop computers, notebooks, gaming consoles (Microsoft Xbox, Nintendo DS, Sony PlayStation, etc.), or the like. The computer system 301 may itself embody one or more of these devices.
- The processor 302 may be disposed in communication with one or more memory devices (e.g., RAM 313, ROM 314, etc.) via a storage interface 312. The storage interface may connect to memory devices including, without limitation, memory drives, removable disc drives, etc., employing connection protocols such as serial advanced technology attachment (SATA), integrated drive electronics (IDE), IEEE-1394, universal serial bus (USB), fiber channel, small computer systems interface (SCSI), etc. The memory drives may further include a drum, magnetic disc drive, magneto-optical drive, optical drive, redundant array of independent discs (RAID), solid-state memory devices, solid-state drives, etc.
- The memory devices may store a collection of program or database components, including, without limitation, an operating system 316, user interface application 317, web browser 318, mail server 319, mail client 320, user/application data 321 (e.g., any data variables or data records discussed in this disclosure), etc. The operating system 316 may facilitate resource management and operation of the computer system 301. Examples of operating systems include, without limitation, Apple Macintosh OS X, Unix, Unix-like system distributions (e.g., Berkeley Software Distribution (BSD), FreeBSD, NetBSD, OpenBSD, etc.), Linux distributions (e.g., Red Hat, Ubuntu, Kubuntu, etc.), IBM OS/2, Microsoft Windows (XP, Vista/7/8, etc.), Apple iOS, Google Android, Blackberry OS, or the like.
- User interface 317 may facilitate the display, execution, interaction, manipulation, or operation of program components through textual or graphical facilities. For example, user interfaces may provide computer interaction interface elements on a display system operatively connected to the computer system 301, such as cursors, icons, check boxes, menus, scrollers, windows, widgets, etc. Graphical user interfaces (GUIs) may be employed, including, without limitation, Apple Macintosh operating systems' Aqua, IBM OS/2, Microsoft Windows (e.g., Aero, Metro, etc.), Unix X-Windows, web interface libraries (e.g., ActiveX, Java, Javascript, AJAX, HTML, Adobe Flash, etc.), or the like.
- The computer system 301 may implement a web browser 318 stored program component. The web browser may be a hypertext viewing application, such as Microsoft Internet Explorer, Google Chrome, Mozilla Firefox, Apple Safari, etc. Secure web browsing may be provided using HTTPS (secure hypertext transport protocol), secure sockets layer (SSL), Transport Layer Security (TLS), etc. Web browsers may utilize facilities such as AJAX, DHTML, Adobe Flash, JavaScript, Java, application programming interfaces (APIs), etc.
- The computer system 301 may implement a mail server 319 stored program component. The mail server may be an Internet mail server such as Microsoft Exchange, or the like. The mail server may utilize facilities such as ASP, ActiveX, ANSI C++/C#, Microsoft .NET, CGI scripts, Java, JavaScript, PERL, PHP, Python, WebObjects, etc. The mail server may utilize communication protocols such as internet message access protocol (IMAP), messaging application programming interface (MAPI), Microsoft Exchange, post office protocol (POP), simple mail transfer protocol (SMTP), or the like.
- The computer system 301 may implement a mail client 320 stored program component. The mail client may be a mail viewing application, such as Apple Mail, Microsoft Entourage, Microsoft Outlook, Mozilla Thunderbird, etc.
- Computer system 301 may store user/application data 321, such as the data, variables, and records described in this disclosure. Such databases may be implemented as fault-tolerant, relational, scalable, secure databases such as Oracle or Sybase. Alternatively, such databases may be implemented using standardized data structures, such as an array, hash, linked list, struct, structured text file (e.g., XML), or table, or as object-oriented databases (e.g., using ObjectStore, Poet, Zope, etc.).
- Such databases may be consolidated or distributed, sometimes among the various computer systems discussed above in this disclosure. It is to be understood that the structure and operation of any computer or database component may be combined, consolidated, or distributed in any working combination.
- A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. A computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term "computer-readable medium" should be understood to include tangible items and to exclude carrier waves and transient signals, i.e., to be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD-ROMs, DVDs, flash drives, disks, and any other known physical storage media.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
- The present disclosure relates generally to managing data, and more particularly, but not limited to, optimally managing heterogeneous data in a distributed storage environment in real time.
- In a present day scenario, considering the exponential increase in data storage requirements and the drastic reduction in storage cost per gigabyte, aggressive optimization of data written to storage media is not seriously undertaken. This leads to generation of extremely large unmanaged datasets distributed across multiple systems, cloud based systems, and various other places. Querying these large datasets entails efforts as it is not known which data lies where. Further, there are no mechanisms available for creating a common index for pulling out data from extremely large datasets spread across various systems and for handling such data efficiently.
- Currently, the data is spread across multiple servers which are interconnected with each other. Various techniques are being developed to leverage the collective power of all the interconnected servers. The main problem is how to efficiently make use of data resources spread across servers available as a single pool of resources for data processing applications, i.e., how to deal with extremely large datasets (for example, archived official datasets for a company, video surveillance data, web crawled data for search engine) which may be only unstructured data and are continuously expanding with time. The main problems associated with such kind of data are as follows:
- Lack of proper centralized yet distributed storage space;
- Lack of computing services available for the given data size;
- No proper access mechanism.
- Data is unstructured; and/or
- Access is very low.
- In view of the above drawbacks, it would be desirable to have a mechanism to use large datasets spread across systems in an efficient and fault tolerant manner in real time.
- Disclosed herein is a method for optimally managing heterogeneous data in a distributed storage environment. The method includes initializing a first storage pool capable of storing data from one or more sources, the first storage pool being distributed across one or more computing devices; storing data from the one or more sources in the first storage pool; generating one or more memory pools in a second storage pool based on amount of data to be stored in the first storage pool and one or more parameters associated with the data stored in the first storage pool; and creating metadata in a first memory pool of the one or more memory pools for the data stored in the first storage pool, the metadata capable of retrieving the data stored in the first storage pool in real-time.
- In another aspect of the invention, a system for optimally managing heterogeneous data in a distributed storage environment is disclosed. The system includes one or more hardware processors and a computer-readable medium storing instructions that, when executed by the one or more hardware processors, cause the one or more hardware processors to perform operations. The operations may include initializing a first storage pool capable of storing data from one or more sources, the first storage pool being distributed across one or more computing devices; storing data from the one or more sources in the first storage pool; generating one or more memory pools in a second storage pool based on amount of data to be stored in the first storage pool and one or more parameters associated with the data stored in the first storage pool, the second storage pool being distributed across the one or more computing devices; and creating metadata in a first memory pool of the one or more memory pools for the data stored in the first storage pool, the metadata capable of retrieving the data stored in the first storage pool in real-time.
- In yet another aspect of the invention, a non-transitory computer-readable medium storing instructions for optimally managing heterogeneous data in a distributed storage environment that, when executed by the one or more hardware processors, cause the one or more hardware processors to perform operations is disclosed. The operations may include initializing a first storage pool capable of storing data from one or more sources, the first storage pool being distributed across one or more computing devices; storing data from one or more sources in the first storage pool; generating one or more memory pools in a second storage pool based on amount of data to be stored in the first storage pool and one or more parameters associated with the data stored in the first storage pool, the second storage pool being distributed across the one or more computing devices; and creating metadata in a first memory pool of the one or more memory pools for the data stored in the first storage pool, the metadata capable of retrieving the data stored in the first storage pool in real-time.
- Additional objects and advantages of the present disclosure will be set forth in part in the following detailed description, and in part will be obvious from the description, or may be learned by practice of the present disclosure. The objects and advantages of the present disclosure will be realized and attained by means of the elements and combinations particularly pointed out in the appended claims.
- The accompanying drawings, which constitute a part of this specification, illustrate several embodiments and, together with the description, serve to explain the disclosed principles. In the drawings:
-
FIG. 1 illustrates a network environment incorporating a system for optimally managing heterogeneous data in real time among a plurality of devices in a network, according to some embodiments of the present disclosure. -
FIG. 2 is a flowchart of an exemplary method for optimally managing heterogeneous data among the plurality of devices in the network in real time, according to some embodiments of the present disclosure. -
FIG. 3 is a block diagram of an exemplary computer system for implementing embodiments consistent with the present disclosure. - As used herein, reference to an element by the indefinite article "a" or "an" does not exclude the possibility that more than one of the element is present, unless the context requires that there is one and only one of the elements. The indefinite article "a" or "an" thus usually means "at least one." The disclosure of numerical ranges should be understood as referring to each discrete point within the range, inclusive of endpoints, unless otherwise noted.
- As used herein, the terms "comprise," "comprises," "comprising," "includes," "including," "has," "having," "contains," or "containing," or any other variation thereof, are intended to cover a non-exclusive inclusion. For example, a composition, process, method, article, system, apparatus, etc. that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed. The terms "consist of," "consists of," "consisting of," or any other variation thereof, excludes any element, step, or ingredient, etc., not specified. The term "consist essentially of," "consists essentially of," "consisting essentially of," or any other variation thereof, permits the inclusion of elements, steps, or ingredients, etc., not listed to the extent they do not materially affect the basic and novel characteristic(s) of the claimed subject matter.
- The present disclosure relates to a system and a method for leveraging a combination of distributed memory management and the distributed data management. A centralized memory spread across systems is integrated with a persistent storage spread across the systems where data can reside efficiently with custom built algorithms for efficient access to the data.
- The system is based on a centralized and distributed architecture, where a data index or metadata (e.g. location of data, access rights etc.) is placed centrally in one or more memory pools and actual data is stored locally across multiple systems, contributing part of one or more memory pools. In this way, each system knows the location of all files present while holding a small part of the data. Similar data is categorized together. This helps the system to browse across similar categories efficiently. When a user wants a particular data, he only needs to access the metadata which is stored centrally in the one or more memory pools. Since the metadata has information about the location of data present across systems, a stream of requested data is redirected back to the user instantly.
-
FIG. 1 illustrates anetwork environment 100 incorporating asystem 102 for optimally managing data in real time among a plurality ofdevices 104 in anetwork 106, according to some embodiments of the present disclosure. - The
system 102 may be implemented in a variety of computing systems, such as a laptop computer, a desktop computer, a notebook, a workstation, a mainframe computer, a server, a network server, and the like. Further, as shown inFigure 1 , the plurality of devices 104-1, 104-2, 104-3, 104-N are communicatively coupled to each other and to thesystem 102 through thenetwork 106 for facilitating one or more end users to access and/or operate thesystem 102. - Further, the
system 102 may aggregate physical memory of the plurality of devices 104-1, 104-2, 104-3, 104-N, collectively referred to asdevices 104 and individually referred to asdevice 104, to create a pool of memory resources. Examples of thedevices 104 include, but are not limited to, a desktop computer, a portable computer, a server, a handheld device, and a workstation. Thedevices 104 may be used by various stakeholders or end users, such as system administrators and application developers. In one implementation, thesystem 102 may be configured in at least one of thedevice 104 to aggregate the memory of the plurality ofdevices 104. - The
network 106 may be a wireless network, wired network or a combination thereof. Thenetwork 106 can be implemented as one of the different types of networks, such as intranet, local area network (LAN), wide area network (WAN), the internet, and such. Thenetwork 106 may either be a dedicated network or a shared network, which represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), etc., to communicate with each other. Further, thenetwork 106 may include a variety of network devices, including routers, bridges, servers, computing devices, storage devices, etc. - The
system 102 may include a processor (not shown inFIG. 1 ), a memory (not shown inFIG. 1 ) coupled to the processor, and interfaces (not shown inFIG. 1 ). The processor may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the processor is configured to fetch and execute computer-readable instructions stored in the memory. The memory can include any non-transitory computer-readable medium known in the art including, for example, volatile memory (e.g., RAM), and/or non-volatile memory (e.g., EPROM, flash memory, etc.). - The interface(s) may include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, etc., allowing the
system 102 to interact with thedevices 104. Further, the interface(s) may enable thesystem 102 respectively to communicate with other computing devices. The interface(s) can facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example LAN, cable, etc., and wireless networks such as WLAN, cellular, or satellite. The interface(s) may include one or more ports for connecting a number of devices to each other or to another server. - As shown in
- As shown in FIG. 1, the system 102 may include a centralized memory 108, a storage repository 110, a metadata serialization manager 112, a storage mapping module 114, a data operation catalogue 116, a combined sharding engine 118, and a data passage interface 120. The centralized memory 108 may be volatile random access memory (RAM), whereas the storage repository may be nonvolatile persistent storage, such as hard disk drive (HDD) or solid state drive (SSD) devices. In an exemplary embodiment, suppose there are 20 servers, each having 20GB of HDD/SSD storage. Each server may contribute 4GB of HDD/SSD towards a unified pool of memory resources having a total capacity of 80GB. The 80GB results from an aggregation of the physical memory resources of each of the servers, and this aggregation presents itself as a unified pool of memory resources. Both the centralized memory 108 and the storage repository 110 may be spread across the devices 104, which are connected to each other and contribute to the pooled memory resources. A plurality of services run on the devices 104, to which all the memory resources present in the various devices 104 are exposed. These services may be used to glue together all the memory spread across the devices 104 and make it available as a single pool of memory, for example, the centralized memory 108, so that memory-intensive applications can run efficiently. If a new device 104 is installed in the network environment 100, the service may be installed on it automatically. The service exposes the memory in the device 104 in which it is running. All the exposed memory resources spread across the devices 104 may be linked and, after linking, made available as a single pool of unified resources, specifically memory.
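- As a rough illustration of the 20-server example above, the sketch below (a hypothetical arrangement; the per-server contribution and the dictionary-based pool are assumptions, not the patent's prescribed mechanism) computes the unified pool capacity:

```python
# Sketch: aggregating per-device contributions into one unified pool.
# The 4GB-per-server figure mirrors the exemplary embodiment above.

servers = {f"server-{i:02d}": 20 for i in range(1, 21)}  # 20GB HDD/SSD each

CONTRIBUTION_GB = 4  # the slice each server exposes to the pool

pool = {name: CONTRIBUTION_GB for name in servers}
total_gb = sum(pool.values())

assert total_gb == 80  # 20 servers x 4GB = 80GB unified pool
print(f"unified pool: {total_gb}GB from {len(pool)} servers")
```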
- The centralized memory 108 may comprise one or more memory pools generated by a communicatively coupled metadata serialization manager 112. In an exemplary embodiment, the one or more generated memory pools may comprise a metadata memory pool 124 and a cached data store memory pool 126.
- Each device 104 contributing to the centralized memory 108 has an underlying persistent storage space whose capacity is usually more than 100x larger than the memory it contributes to the centralized memory 108. This storage may be used for the high-volume data repository 110, handling heterogeneous varieties of data spread across persistent storage in the devices 104. The details of each file stored in the distributed repository 110 may be updated in the metadata stored in the metadata memory pool 124. The metadata is updated by the metadata serialization manager 112, which collects details from the storage mapping module 114.
- The metadata serialization manager 112 may accept details of the data stored in persistent storage and update the metadata stored in the centralized memory 108. After receiving the updates, it may access data maps from the storage mapping module 114 for each file, serialize them, and convert them into memory objects for in-memory access. Data maps may comprise the combined details of each file stored in the storage repository 110. The metadata serialized objects may also be written redundantly to the file system in the storage repository 110 at defined time intervals for recovering from failures. The metadata details may be captured from the storage mapping module 114 and may include the following parameters, which can increase over time: the location of the actual data files spread across multiple devices 104, file permission details, concurrent access rights and exclusive access details, data criticality (the number of data copies to maintain across the devices 104 for fault tolerance), data shard details, and file status (deleted, writing in progress, streaming data, whether the file is cached in the cached data store memory pool, pending deletion, etc.).
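- One way to picture the serialized metadata objects is as a per-file record carrying the parameters listed above. The sketch below is an assumption about the record's shape; the field names are illustrative, and pickle merely stands in for whatever serialization the metadata serialization manager 112 actually applies:

```python
import pickle
from dataclasses import dataclass, field

# Sketch of a per-file metadata record; the fields mirror the parameters
# listed above, but the exact layout is an assumption.

@dataclass
class FileMetadata:
    locations: list[str]              # devices holding actual data copies
    permissions: str = "rw-r--r--"
    concurrent_access: int = 1        # semaphore count for shared access
    exclusive_access: bool = False    # mutex currently held?
    criticality: int = 1              # number of copies to maintain
    shard_details: dict = field(default_factory=dict)
    status: str = "ok"                # deleted / writing / streaming / cached ...

record = FileMetadata(locations=["device-1", "device-7"], criticality=2)

# Serialize into a memory object (and, at intervals, to the file system
# in the storage repository for failure recovery).
blob = pickle.dumps(record)
restored = pickle.loads(blob)
assert restored.locations == ["device-1", "device-7"]
```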
- The storage mapping module 114 may be designed to take updates from the data operation catalogue module 116, the storage repository 110, and the centralized memory 108 in order to create an overall map of the data stored in persistent storage in the storage repository 110. The created map contains details captured from all the modules it interacts with. For example, the map may comprise an index of all the operations performed on a given data set, with the help of the data operation catalogue module 116. Alternatively or in addition, the map may comprise the current location of, and details about, all the copies of the given data set, using the storage repository 110; when the given data set is cached in the cached data store memory pool 126, the same is updated in the map. This map may be forwarded to the metadata serialization manager 112 for updating the overall metadata status in the metadata memory pool. The caching status of each file may be available with the storage mapping module 114, since it has direct access to the storage repository 110 and the cached data store memory pool 126.
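- The overall map can be thought of as a per-file merge of what the three sources know. A minimal sketch, assuming simple dictionaries as stand-ins for the data operation catalogue 116, the storage repository 110, and the cached data store memory pool 126:

```python
# Sketch: building a per-file map from the three update sources.
# The input dictionaries stand in for the catalogue, repository and cache.

operations_log = {"report.csv": ["upload", "append"]}      # catalogue 116
copy_locations = {"report.csv": ["device-1", "device-7"]}  # repository 110
cached_files = {"report.csv"}                              # cache pool 126

def build_map(name):
    """Merge everything known about one file into a single map entry."""
    return {
        "operations": operations_log.get(name, []),
        "copies": copy_locations.get(name, []),
        "cached": name in cached_files,
    }

# Forwarded to the metadata serialization manager for the metadata pool.
entry = build_map("report.csv")
assert entry == {
    "operations": ["upload", "append"],
    "copies": ["device-1", "device-7"],
    "cached": True,
}
```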
- Further, the data passage interface 120 may be a single point of contact in the system 102 where an end user can interact with the system 102. The data passage interface 120 may be responsible for handling all the input and output operations happening in the system 102. Whenever there is a read request for any file stored in the system, it is served via the data passage interface 120, which interacts with the metadata stored in the metadata memory pool 124 using the metadata serialization manager 112 and the storage mapping module 114. When the data is present in the cached data store memory pool 126, it may be served to the data passage interface 120 using the storage mapping module 114. This interface may also provide the access frequency of each file to the storage mapping module 114, which helps in caching frequently accessed data in the cached data store memory pool 126 within the centralized memory 108 for even faster access.
- All tasks related to data writing in the distributed storage layer in the storage repository 110 may be handled by the combined sharding module 118. This module has an intelligent sharding engine which may detect the type of each file and categorize files accordingly. A user may also specify a custom type to categorize files according to his or her needs. If a file type is not detected by this module, or the user has disabled file type detection, the file is stored directly in one of the available persistent storage locations with ample space. The sharding details and the location of the file in persistent storage may be forwarded to the data operation catalogue module 116, which helps in updating the storage mapping module 114 and the overall metadata of the storage repository 110.
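- A minimal sketch of the categorization step follows; the extension table, the free-space fallback, and the placement policy are assumptions, since the patent leaves the detection mechanism open:

```python
import os

# Sketch: categorize an uploaded file by detected type, with a fallback
# to any persistent store that has ample free space.

TYPE_CATEGORIES = {".csv": "tabular", ".mp4": "video", ".log": "text"}
free_space_gb = {"device-1": 2, "device-2": 9, "device-3": 5}

def categorize(filename, custom_type=None, detect=True):
    if custom_type:                      # user-specified custom category
        return custom_type
    if detect:
        ext = os.path.splitext(filename)[1].lower()
        if ext in TYPE_CATEGORIES:
            return TYPE_CATEGORIES[ext]
    return None                          # unknown, or detection disabled

def place(filename, category):
    if category is None:
        # Store directly in whichever device has the most free space.
        return max(free_space_gb, key=free_space_gb.get)
    # A category-specific placement policy would apply here.
    return "device-1" if category == "tabular" else "device-2"

assert categorize("trades.csv") == "tabular"
assert place("blob.bin", categorize("blob.bin")) == "device-2"
```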
- A real-time data repository 110 should provide enough performance to qualify as real-time while still providing most of the I/O operations on its data. The data operation catalogue module 116 may be responsible for handling and processing all the I/O operations performed in the system. The data operation catalogue module 116 may primarily provide upload, download, delete, and append operations. Operations are not limited to these four types and can extend over time.
- When a user uploads a file via the data passage interface 120, it may be routed via the combined sharding module 118, which decides where to place the file according to the shard hints given or the type of file uploaded. A user may also specify the data criticality of the file so that the system 102 may redundantly place multiple copies of the same file in the storage repository 110 for fault tolerance. After the file upload is completed, the same information (including the locations of the multiple copies) may be updated in the metadata memory pool 124 using the storage mapping module 114 and the metadata serialization manager 112.
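- The criticality parameter translates directly into a replica count. A sketch of the placement step, assuming a simple pick-first placement policy:

```python
# Sketch: place `criticality` copies of a file on distinct devices.

devices = ["device-1", "device-2", "device-3", "device-4"]

def place_copies(filename, criticality):
    """Pick `criticality` distinct devices to hold redundant copies."""
    if criticality > len(devices):
        raise ValueError("not enough devices for requested redundancy")
    chosen = devices[:criticality]   # a real policy would weigh free space
    return {filename: chosen}

# criticality=3 -> three copies, locations recorded for the metadata pool
placement = place_copies("invoice.pdf", criticality=3)
assert placement == {"invoice.pdf": ["device-1", "device-2", "device-3"]}
```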
- For handling a file request operation, the data passage interface 120 may interact with the storage mapping module (after getting the storage location from the metadata serialization manager 112) to find the nearest copy of the requested data, which can then be served to the requester.
- When an update/append or delete operation is requested via the data passage interface 120, the same procedure may be repeated to find the nearest location of the data using the storage mapping module 114. Here a user may specify whether or not to retain the old copy of the data. Even if the user has requested not to keep an old copy, the storage repository 110 may still retain an older version of the data, based on the free space available in the storage repository 110 and a configured amount of time defined in the storage repository 110.
- The data operation catalogue module 116 may include a data access index creator 128 and a metadata integrator 130. The system 102 may handle multiple users accessing the same data at the same time. Accessing the same data may cause corruption when more than one user is updating the same file. To avoid this problem, the storage repository 110 has a connection to the centralized memory pools, which expose an API module to cater for the access mechanism of each file stored in the storage repository 110. It uses semaphores and mutual exclusion (mutex), whereby multiple users can effectively access files concurrently based on the status of the semaphore and mutex index. In an exemplary embodiment, if a file has a semaphore value of five, the repository would allow five concurrent users to access the same file without corruption. Mutual exclusion provides exclusive access to a file, which only one user can hold; a mutex is mostly required for update operations on the stored data.
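- The per-file access mechanism can be pictured with standard primitives. The sketch below uses Python's threading module as a stand-in for the repository's API module; coordination between concurrent readers and an exclusive writer is elided for brevity:

```python
import threading

# Sketch: per-file concurrency control with a counting semaphore for
# shared access and a mutex for exclusive (update) access.

class FileAccess:
    def __init__(self, max_concurrent=5):
        self.sem = threading.Semaphore(max_concurrent)  # e.g. a value of five
        self.mutex = threading.Lock()

    def read(self, do_read):
        with self.sem:           # at most five users hold the semaphore
            return do_read()

    def update(self, do_write):
        with self.mutex:         # exactly one user; others wait
            return do_write()

access = FileAccess(max_concurrent=5)
print(access.read(lambda: "contents"))        # concurrent (shared) access
print(access.update(lambda: "new contents"))  # exclusive access
```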
- The work of the metadata integrator may be to aggregate access details about each file stored in the storage repository 110 and update the same in the metadata using the storage mapping module 114 and the metadata serialization manager 112. To avoid data corruption caused by allowing more than the defined number of accesses to a single file, the metadata integrator may actively update the status of each file stored in the system, since users may access files continuously and the access index can increase and decrease frequently.
- FIG. 2 is a flowchart of an exemplary method for optimally managing data among a plurality of devices 104 in a network 106 in real time, according to some embodiments of the present disclosure. The method may be executed by the system 102 as described in further detail below. It is noted, however, that the functions and/or steps of FIG. 2 as implemented by the system 102 may be provided by different architectures and/or implementations without departing from the scope of the present disclosure.
- Referring to FIG. 2, at step 200, the heterogeneous data sources are identified and the storage repository 110 is initialized in real time. In this initializing step, once the sources, i.e., the devices 104, have been identified, instructions may be given to the system 102 for initializing the storage repository 110. The initialization starts all the required services on the devices 104 and provides the information parameters of the centralized memory 108 and the total persistent storage, i.e., the storage repository 110. After the services have been initialized and capacity details have been circulated across the servers, the storage repository 110 may be made available for use.
- At step 202, dynamic memory pools are created based on the requirements of the storage repository 110 in real time. Once the storage repository 110 is available for use, data loading may be started after creating one or more memory pools in the centralized memory 108. As discussed earlier, the one or more memory pools may comprise the metadata memory pool 124 and the cached data store memory pool 126, which are defined as follows:
- Metadata memory pool - The storage repository 110 serves requests in real time with instant access to any file requested among a large collection of heterogeneous files. This is achieved by various mechanisms built into the storage repository 110, such as a caching mechanism, in-memory access to file locations, dedicated metadata memory pools of large size for proper working, and access to the persistent storage. All these mechanism details are accessed in fractions of a second for serving real-time requests, by storing them in the centralized memory 108 in the form of metadata. The metadata are the serialized memory objects which hold the details of each file stored in the storage repository 110, the location of the nearest copy, the state of the file, the number of users connected, cached file details, the number of memory pools used, etc.
- Cached data store memory pool - The cached data store memory pool is cache storage for files which are frequently accessed through the system 102. This may be required to reduce the overall load on the storage repository 110. After a defined frequency of access, the data passage interface 120 may instruct the storage mapping module 114 to cache a given file for even faster access. The status of the file may then be updated in the metadata memory pool 124, and the next access to the same file may be served from the cached data store memory pool 126 without involving the storage repository at all.
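- Promotion into the cached data store can be driven by a simple access counter. A sketch, assuming a made-up threshold of three accesses and no eviction policy:

```python
from collections import Counter

# Sketch: promote a file into the cached data store memory pool once its
# access frequency crosses a defined threshold.

CACHE_THRESHOLD = 3
access_counts = Counter()
cache = {}                                    # cached data store pool 126
repository = {"report.csv": b"...bytes..."}   # persistent storage

def read(name):
    access_counts[name] += 1
    if name in cache:                  # served without touching the repository
        return cache[name]
    data = repository[name]
    if access_counts[name] >= CACHE_THRESHOLD:
        cache[name] = data             # status also updated in the metadata
    return data

for _ in range(4):
    read("report.csv")
assert "report.csv" in cache           # the fourth read came from cache
```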
- The number and size of these memory pools are based on the amount of data to be loaded in the system and various other parameters, such as future memory requirements, data access policies, data redundancy details, etc. One of the memory pools, i.e., the metadata memory pool 124, is dedicated to handling the metadata for all the data stored in the storage repository 110. This metadata has all the details about the data stored, including the location of the data, access policies, number of copies, status of the data, retention period, etc. It also has the details about the nearest copy of the data across the devices 104, so that data can be served in the lowest time possible. All the operations performed in the storage repository 110 are handled by the APIs provided.
- At step 204, the metadata memory pool is created and maintained. The storage repository 110 may be serving heterogeneous data in real time. This speed may be maintained with the help of the various memory pools present in the centralized memory 108. One of the memory pools, i.e., the metadata memory pool 124, handles the metadata of all the data stored in the storage repository 110.
- All the metadata details stored in the dedicated metadata memory pool 124 may act as the contact point for the files stored in the storage repository 110. The information stored in the metadata is critical, and the storage repository 110 cannot afford to lose it. Thus the metadata present in the metadata memory pool 124 may be redundantly stored in the storage repository 110, which comprises the persistent storage. Persistent storage may be the hardware storage devices (HDD/SSD, etc.) present in all the devices 104 for storing data. Also, whenever the storage repository 110 is restarted, instead of creating the metadata index again, it may directly read the information flushed to the persistent storage and validate against it for any changes. This saves overall time and makes the restart more efficient.
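- The redundant flush of the metadata pool, and its reuse on restart, might look like the following sketch; JSON on the local file system is used purely for illustration:

```python
import json
import os
import tempfile

# Sketch: periodically flush the in-memory metadata to persistent storage,
# then rebuild from the flushed copy on restart instead of re-indexing.

metadata_pool = {"report.csv": {"copies": ["device-1", "device-7"],
                                "status": "ok"}}

flush_path = os.path.join(tempfile.gettempdir(), "metadata_flush.json")

def flush(pool):
    with open(flush_path, "w") as fh:
        json.dump(pool, fh)

def restart():
    """Read the flushed metadata back instead of re-creating the index."""
    with open(flush_path) as fh:
        restored = json.load(fh)
    # A real system would now validate the restored index against any
    # changes that happened while it was down.
    return restored

flush(metadata_pool)
assert restart() == metadata_pool
```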
- Further, devices 104 which are collectively part of the heterogeneous storage repository 110 may fail from time to time. This results in an overall change in the available resources; specifically, the overall available resources are reduced by the amount the failed systems were providing to the repository. Whenever there is a change in the configuration of a system because a failure happens, the update may be distributed instantly to all other devices 104 so that they can update themselves accordingly. Thus the current state of the storage repository 110 may always be transparent to the user, to avoid any unknown failures, data corruption, or data loss.
- At step 206, the file operation processes are performed and the corresponding metadata is updated. One of the operations is an upload file operation. Whenever a file is uploaded to the storage repository 110, the file is stored in an appropriate location based on the available space. During upload, the user only has to specify the criticality of the file so that the storage repository can place multiple copies of the given file across the devices 104. The details are then updated in the metadata.
- Another operation is a download file operation. When a file download is initiated, the system 102 may automatically identify the nearest copy of the requested file across the devices 104 using the in-memory metadata. It may then redirect the requester to that file copy.
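- Nearest-copy selection can be as simple as minimizing a proximity metric over the copy locations recorded in the metadata. A sketch, with made-up hop counts standing in for whatever distance measure is used:

```python
# Sketch: redirect a download to the nearest of the recorded copies.

copy_locations = {"video.mp4": ["device-2", "device-5", "device-9"]}

# Hypothetical proximity of each device to the requester (e.g. hop count).
hops_from_requester = {"device-2": 4, "device-5": 1, "device-9": 7}

def nearest_copy(name):
    copies = copy_locations[name]
    return min(copies, key=hops_from_requester.get)

assert nearest_copy("video.mp4") == "device-5"
```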
- Yet another operation is a delete/update operation. When a file is deleted or updated in the real-time file system, the deleted or older version of the file is maintained at a separate location, and the same is updated in the in-memory metadata. The retention period of the deleted data can be configured in the repository.
- It is to be noted that the present disclosure is not limited to the four operations stated above. There may be other operations as well.
- At step 208, distributed access to the data sources is configured using semaphores and a mutex API. The files stored in the storage repository 110 may be configured to be accessed in real time. There may also be a requirement that a single file stored in the persistent storage be accessed by multiple users of the storage repository 110. This kind of access is fine when the users issue only read requests, but in cases where multiple users access the file for read and write operations, the data might get corrupted. To handle such situations, an API module may be provided to configure distributed access in the persistent storage as well. The distributed access uses the concepts of semaphores and mutual exclusion to define access. Once a user acquires a mutex lock over a file, it cannot be accessed by any other users, who are given a wait flag. In situations where multiple accesses may be provided, semaphores are used, and the number of semaphores defines the number of users. For example, when a user is granted semaphore access, the total semaphore count is reduced by one, so the number of possible simultaneous accesses is also reduced; when the operation finishes, the semaphore index is increased by one again. All the distributed-access-related details are accessed and updated via the metadata.
- At step 210, errors arising are rectified using the metadata. In case of an error arising out of the failure of one or two devices 104, the information is quickly circulated to the metadata memory pool 124, which then initiates the creation of new copies of the data whose redundancy has decreased after the system failure. The overall state of, and space available in, the storage repository is also updated in the metadata.
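- Step 210 amounts to re-replicating any file whose live copy count has dropped below its criticality. A sketch, assuming the failure signal and the choice of spare device shown here:

```python
# Sketch: after a device failure, restore redundancy for affected files.

criticality = {"invoice.pdf": 3, "video.mp4": 2}
copies = {
    "invoice.pdf": {"device-1", "device-2", "device-3"},
    "video.mp4": {"device-2", "device-4"},
}
alive = {"device-1", "device-3", "device-4", "device-5"}

def rectify(failed_device):
    for name, locs in copies.items():
        locs.discard(failed_device)                 # drop the lost copy
        while len(locs) < criticality[name]:        # redundancy decreased?
            spare = next(d for d in alive if d not in locs)
            locs.add(spare)                         # create a new copy

rectify("device-2")
assert len(copies["invoice.pdf"]) == 3 and len(copies["video.mp4"]) == 2
```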
- FIG. 3 is a block diagram of an exemplary computer system for implementing embodiments consistent with the present disclosure. Variations of computer system 301 may be used for implementing any of the devices and/or device components presented in this disclosure, including system 102. Computer system 301 may comprise a central processing unit (CPU or processor) 302. Processor 302 may comprise at least one data processor for executing program components for executing user- or system-generated requests. A user may include a person using a device such as those included in this disclosure, or such a device itself. The processor may include specialized processing units such as integrated system (bus) controllers, memory management control units, floating point units, graphics processing units, digital signal processing units, etc. The processor may include a microprocessor, such as AMD Athlon, Duron or Opteron, ARM's application, embedded or secure processors, IBM PowerPC, Intel's Core, Itanium, Xeon, Celeron or other lines of processors, etc. The processor 302 may be implemented using mainframe, distributed processor, multi-core, parallel, grid, or other architectures. Some embodiments may utilize embedded technologies like application-specific integrated circuits (ASICs), digital signal processors (DSPs), field programmable gate arrays (FPGAs), etc.
- Processor 302 may be disposed in communication with one or more input/output (I/O) devices via I/O interface 303. The I/O interface 303 may employ communication protocols/methods such as, without limitation, audio, analog, digital, monaural, RCA, stereo, IEEE-1394, serial bus, universal serial bus (USB), infrared, PS/2, BNC, coaxial, component, composite, digital visual interface (DVI), high-definition multimedia interface (HDMI), RF antennas, S-Video, VGA, IEEE 802.11a/b/g/n/x, Bluetooth, cellular (e.g., code-division multiple access (CDMA), high-speed packet access (HSPA+), global system for mobile communications (GSM), long-term evolution (LTE), WiMax, or the like), etc.
- Using the I/O interface 303, the computer system 301 may communicate with one or more I/O devices. For example, the input device 304 may be an antenna, keyboard, mouse, joystick, (infrared) remote control, camera, card reader, fax machine, dongle, biometric reader, microphone, touch screen, touchpad, trackball, sensor (e.g., accelerometer, light sensor, GPS, gyroscope, proximity sensor, or the like), stylus, scanner, storage device, transceiver, video device/source, visors, etc. Output device 305 may be a printer, fax machine, video display (e.g., cathode ray tube (CRT), liquid crystal display (LCD), light-emitting diode (LED), plasma, or the like), audio speaker, etc. In some embodiments, a transceiver 306 may be disposed in connection with the processor 302. The transceiver may facilitate various types of wireless transmission or reception. For example, the transceiver may include an antenna operatively connected to a transceiver chip (e.g., Texas Instruments WiLink WL1283, Broadcom BCM4750IUB8, Infineon Technologies X-Gold 518-PMB9800, or the like), providing IEEE 802.11a/b/g/n, Bluetooth, FM, global positioning system (GPS), 2G/3G HSDPA/HSUPA communications, etc.
- In some embodiments, the processor 302 may be disposed in communication with a communication network 308 via a network interface 307. The network interface 307 may communicate with the communication network 308. The network interface may employ connection protocols including, without limitation, direct connect, Ethernet (e.g., twisted pair 10/100/1000 Base T), transmission control protocol/internet protocol (TCP/IP), token ring, IEEE 802.11a/b/g/n/x, etc. The communication network 308 may include, without limitation, a direct interconnection, local area network (LAN), wide area network (WAN), wireless network (e.g., using Wireless Application Protocol), the Internet, etc. Using the network interface 307 and the communication network 308, the computer system 301 may communicate with devices 309. These devices may include, without limitation, personal computer(s), server(s), fax machines, printers, scanners, and various mobile devices such as cellular telephones, smartphones (e.g., Apple iPhone, Blackberry, Android-based phones, etc.), tablet computers, eBook readers (Amazon Kindle, Nook, etc.), laptop computers, notebooks, gaming consoles (Microsoft Xbox, Nintendo DS, Sony PlayStation, etc.), or the like. In some embodiments, the computer system 301 may itself embody one or more of these devices.
- In some embodiments, the processor 302 may be disposed in communication with one or more memory devices (e.g., RAM 313, ROM 314, etc.) via a storage interface 312. The storage interface may connect to memory devices including, without limitation, memory drives, removable disc drives, etc., employing connection protocols such as serial advanced technology attachment (SATA), integrated drive electronics (IDE), IEEE-1394, universal serial bus (USB), fiber channel, small computer systems interface (SCSI), etc. The memory drives may further include a drum, magnetic disc drive, magneto-optical drive, optical drive, redundant array of independent discs (RAID), solid-state memory devices, solid-state drives, etc.
- The memory devices may store a collection of program or database components, including, without limitation, an operating system 316, user interface application 317, web browser 318, mail server 319, mail client 320, user/application data 321 (e.g., any data variables or data records discussed in this disclosure), etc. The operating system 316 may facilitate resource management and operation of the computer system 301. Examples of operating systems include, without limitation, Apple Macintosh OS X, Unix, Unix-like system distributions (e.g., Berkeley Software Distribution (BSD), FreeBSD, NetBSD, OpenBSD, etc.), Linux distributions (e.g., Red Hat, Ubuntu, Kubuntu, etc.), IBM OS/2, Microsoft Windows (XP, Vista/7/8, etc.), Apple iOS, Google Android, Blackberry OS, or the like. User interface 317 may facilitate display, execution, interaction, manipulation, or operation of program components through textual or graphical facilities. For example, user interfaces may provide computer interaction interface elements on a display system operatively connected to the computer system 301, such as cursors, icons, check boxes, menus, scrollers, windows, widgets, etc. Graphical user interfaces (GUIs) may be employed, including, without limitation, Apple Macintosh operating systems' Aqua, IBM OS/2, Microsoft Windows (e.g., Aero, Metro, etc.), Unix X-Windows, web interface libraries (e.g., ActiveX, Java, Javascript, AJAX, HTML, Adobe Flash, etc.), or the like.
- In some embodiments, the computer system 301 may implement a web browser 318 stored program component. The web browser may be a hypertext viewing application, such as Microsoft Internet Explorer, Google Chrome, Mozilla Firefox, Apple Safari, etc. Secure web browsing may be provided using HTTPS (secure hypertext transport protocol), secure sockets layer (SSL), Transport Layer Security (TLS), etc. Web browsers may utilize facilities such as AJAX, DHTML, Adobe Flash, JavaScript, Java, application programming interfaces (APIs), etc. In some embodiments, the computer system 301 may implement a mail server 319 stored program component. The mail server may be an Internet mail server such as Microsoft Exchange, or the like. The mail server may utilize facilities such as ASP, ActiveX, ANSI C++/C#, Microsoft .NET, CGI scripts, Java, JavaScript, PERL, PHP, Python, WebObjects, etc. The mail server may utilize communication protocols such as internet message access protocol (IMAP), messaging application programming interface (MAPI), Microsoft Exchange, post office protocol (POP), simple mail transfer protocol (SMTP), or the like. In some embodiments, the computer system 301 may implement a mail client 320 stored program component. The mail client may be a mail viewing application, such as Apple Mail, Microsoft Entourage, Microsoft Outlook, Mozilla Thunderbird, etc.
- In some embodiments, computer system 301 may store user/application data 321, such as the data, variables, records, etc. as described in this disclosure. Such databases may be implemented as fault-tolerant, relational, scalable, secure databases such as Oracle or Sybase. Alternatively, such databases may be implemented using standardized data structures, such as an array, hash, linked list, struct, or structured text file (e.g., XML), as a table, or as object-oriented databases (e.g., using ObjectStore, Poet, Zope, etc.). Such databases may be consolidated or distributed, sometimes among the various computer systems discussed above in this disclosure. It is to be understood that the structure and operation of any computer or database component may be combined, consolidated, or distributed in any working combination.
- The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments.
- Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term "computer-readable medium" should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
- It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.
Claims (13)
- A method for managing data in a distributed storage environment, the method comprising:
initializing a first storage pool capable of storing data from one or more sources, the first storage pool being distributed across one or more computing devices;
storing data from the one or more sources in the first storage pool;
generating one or more memory pools in a second storage pool based on an amount of data to be stored in the first storage pool and one or more parameters associated with the data stored in the first storage pool; and
creating metadata in a first memory pool of the one or more memory pools for the data stored in the first storage pool, the metadata being for use in retrieving the data stored in the first storage pool.
- The method of claim 1, wherein the metadata comprises at least one of a location of the data, access rights associated with the data, a number of copies of the data, status of the data, a retention period of the data, and a location of the nearest copy of the data across the one or more computing devices.
- The method of claim 1 or claim 2, further comprising:
performing one or more operations associated with the data stored in the first storage pool; and
updating the metadata in the second storage pool in response to performing the one or more operations associated with the data stored in the first storage pool.
- The method of any of the preceding claims, further comprising:
receiving a request to retrieve a first data from the first storage pool;
fetching a map indicative of the location of the first data in the first storage pool; and
fetching a nearest copy of the first data using the map.
- The method of any of the preceding claims, wherein initializing the first storage pool further comprises:
initiating one or more services on the one or more computing devices to provide the storage capacity of the first storage pool; and
sharing the storage capacity by the one or more computing devices among themselves.
- The method of any of the preceding claims, further comprising generating a cache data store in a second memory pool of the one or more memory pools, the cache data store capable of storing at least a portion of the data, the portion of the data comprising frequently accessed data.
- The method of any of the preceding claims, further comprising:
detecting a failure in at least one of the one or more computing devices;
circulating the information regarding the failure to the metadata;
creating one or more redundant copies of the data corrupted by the failure; and
updating the metadata in response to creating one or more redundant copies of the data.
- The method of any of the preceding claims, wherein the first storage pool comprises a persistent storage and wherein the second storage pool comprises a volatile random access memory.
- The method of any of the preceding claims, further comprising:
providing concurrent and exclusive access to multiple users of the first storage pool in real-time using one or more semaphores and a mutual exclusion index.
- The method of any of the preceding claims, further comprising:
grouping similar types of data together for faster access.
- The method of any of the preceding claims, wherein the metadata is capable of retrieving the data stored in the first storage pool in real-time.
- A system for managing data in a distributed storage environment, the system comprising:
one or more hardware processors;
a computer-readable medium storing instructions that, when executed by the one or more hardware processors, cause the one or more hardware processors to perform the method of any of the preceding claims.
- A computer-readable medium comprising instructions for managing data in a distributed storage environment that, when executed by one or more hardware processors, cause the one or more hardware processors to perform the method of any of claims 1 to 11.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
IN4676CH2014 | 2014-09-24 | ||
US14/542,221 US9807167B2 (en) | 2014-09-24 | 2014-11-14 | System and method for optimally managing heterogeneous data in a distributed storage environment |
Publications (1)
Publication Number | Publication Date |
---|---|
EP3001320A1 true EP3001320A1 (en) | 2016-03-30 |
Family
ID=53719687
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP15177580.6A Withdrawn EP3001320A1 (en) | 2014-09-24 | 2015-07-20 | System and method for optimally managing heterogeneous data in a distributed storage environment |
Country Status (1)
Country | Link |
---|---|
EP (1) | EP3001320A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080313207A1 (en) * | 2007-06-13 | 2008-12-18 | Chad Modad | System and method for collection, retrieval, and distribution of data |
US20090210875A1 (en) * | 2008-02-20 | 2009-08-20 | Bolles Benton R | Method and System for Implementing a Virtual Storage Pool in a Virtual Environment |
US20110138102A1 (en) * | 2009-10-30 | 2011-06-09 | International Business Machines Corporation | Data storage management using a distributed cache scheme |
2015-07-20: EP EP15177580.6A patent/EP3001320A1/en not_active Withdrawn
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9807167B2 (en) | System and method for optimally managing heterogeneous data in a distributed storage environment | |
US11586374B2 (en) | Index lifecycle management | |
US9921766B2 (en) | Methods and systems for managing memory of a storage drive | |
US20140164487A1 (en) | File saving system and method | |
US11019156B1 (en) | Automatic discovery and registration of service applications for files introduced to a user interface | |
US11327924B2 (en) | Archiving data sets in a volume in a primary storage in a volume image copy of the volume in a secondary storage | |
US11112995B2 (en) | Systems and methods for random to sequential storage mapping | |
US10140444B2 (en) | Methods and systems for dynamically managing access to devices for resolution of an incident ticket | |
US12032453B2 (en) | Systems and methods for backup and restore of container-based persistent volumes | |
US11741256B2 (en) | Open access token for determining that file is accessible to everyone | |
US11669387B2 (en) | Proactive risk reduction for data management | |
US12099886B2 (en) | Techniques for performing clipboard-to-file paste operations | |
US11010476B2 (en) | Security-aware caching of resources | |
US20150134628A1 (en) | End of retention processing using a database manager scheduler | |
JP2019537097A (en) | Tracking I-node access patterns and prefetching I-nodes | |
US11010408B2 (en) | Hydration of a hierarchy of dehydrated files | |
CN114385733A (en) | Method and device for unified creation of data model in ETL process | |
US9582331B2 (en) | System and method for a smart operating system for integrating dynamic case management into a process management platform | |
EP3001320A1 (en) | System and method for optimally managing heterogeneous data in a distributed storage environment | |
US10303572B2 (en) | Methods and systems for improving fault tolerance in storage area network | |
US12001408B2 (en) | Techniques for efficient migration of key-value data | |
EP3349417B1 (en) | System and method for storing and delivering digital content | |
US9288265B2 (en) | Systems and methods for performing memory management in a distributed environment | |
CN113126928A (en) | File moving method and device, electronic equipment and medium | |
US10282243B2 (en) | Performance enhancement for platform data dump collection |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012
| AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR
| AX | Request for extension of the european patent | Extension state: BA ME
| 17P | Request for examination filed | Effective date: 20160705
| RBV | Designated contracting states (corrected) | Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: EXAMINATION IS IN PROGRESS
| 17Q | First examination report despatched | Effective date: 20200403
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: EXAMINATION IS IN PROGRESS
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN
| 18W | Application withdrawn | Effective date: 20210527