Hash index documentation was added. · orientechnologies/orientdb@96b1768 · GitHub
Commit 96b1768

Hash index documentation was added.
1 parent aeff00c commit 96b1768

File tree

3 files changed (+60, -2 lines)

core/src/main/java/com/orientechnologies/orient/core/index/hashindex/local/OLocalHashTable.java

Lines changed: 54 additions & 0 deletions
@@ -34,6 +34,60 @@
 import java.util.Iterator;
 /**
  * Implementation of a hash index based on the <a href="http://en.wikipedia.org/wiki/Extendible_hashing">extendible hashing
  * algorithm</a>. The directory for extendible hashing is implemented in the
  * {@link com.orientechnologies.orient.core.index.hashindex.local.OHashTableDirectory} class. The directory is not implemented
  * according to the classic algorithm, because of its large memory consumption in case of non-uniform data distribution; instead
  * it follows "Multilevel Extendible Hashing" (Sven Helmer, Thomas Neumann, Guido Moerkotte, April 17, 2002), which consumes
  * much less memory for non-uniform data distributions.
  *
44+
  * The index itself uses a so-called "multilevel schema": the first level contains 256 buckets, and when a bucket is split it is
  * appended to the end of another file which represents the second level. So if the inserted data has a distribution close to
  * uniform (this index was designed to be used as a RID index for DHT storage), bucket splits are performed in an append-only
  * manner to speed up index writes.
  *
  * Each hash index bucket has the following structure:
50+
  * <ol>
  * <li>Bucket depth - 1 byte.</li>
  * <li>Bucket size - the number of entries (key, value) in the bucket - 4 bytes.</li>
  * <li>Page indexes of the parents of this bucket, i.e. page indexes of the buckets whose splits created the current bucket -
  * 64*8 bytes.</li>
  * <li>Offsets of the entries stored in this bucket, relative to its beginning. This is an array of int values of undefined
  * size.</li>
  * <li>The entries themselves.</li>
  * </ol>
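The fixed-size header fields listed above can be sketched as byte offsets. The field sizes below are taken from the comment, but the class and constant names are this sketch's own, not OrientDB's actual bucket implementation:

```java
// Illustrative layout of the bucket header described above. Sizes come from
// the documentation (1 byte depth, 4 bytes size, 64*8 bytes parent indexes);
// names and structure are assumptions made for this sketch.
public class BucketLayoutSketch {
    static final int DEPTH_OFFSET = 0;                          // bucket depth: 1 byte
    static final int SIZE_OFFSET = DEPTH_OFFSET + 1;            // entry count: 4 bytes
    static final int PARENT_INDEXES_OFFSET = SIZE_OFFSET + 4;   // 64 parent page indexes, 8 bytes each
    static final int OFFSETS_ARRAY_OFFSET = PARENT_INDEXES_OFFSET + 64 * 8; // sorted int offsets start here

    public static void main(String[] args) {
        // the variable-size offset array begins right after the fixed header
        System.out.println(OFFSETS_ARRAY_OFFSET); // 517 = 1 + 4 + 512
    }
}
```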
57+
  *
  * The first two fields are self-explanatory; the remaining ones deserve discussion.
  *
61+
  * Entries in a bucket are sorted by the key's hash code, so each entry has the following storage format in the bucket: key's
  * hash code (8 bytes), key, value. Because entries are stored in sorted order, every insertion of a new entry would require
  * moving the existing ones.
  *
  * That is bad for two reasons:
  * <ol>
  * <li>It would generate a write-ahead log of enormous size.</li>
  * <li>The more memory an operation touches, the slower it runs; in the worst case 60 kb of memory would have to be moved.</li>
  * </ol>
71+
  *
  * To avoid these disadvantages, entries are appended to the end of the bucket, while their offsets are stored at its beginning.
  * The offsets are kept in sorted order (ordered by the hash code of the entry's key), so only a small amount of memory has to
  * be moved to keep the entries in sorted order.
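The append-plus-sorted-offsets technique described above can be sketched in miniature: entry bytes are appended at the free end of the page, and only the small offset array is shifted to preserve hash order. All names here are illustrative assumptions, not OrientDB's actual bucket code:

```java
import java.util.Arrays;

// Toy model of the trick described in the comment: entries are appended,
// while a small sorted offset array preserves logical hash order.
public class OffsetArraySketch {
    long[] hashes = new long[0]; // key hash for each offset, kept sorted
    int[] offsets = new int[0];  // page offsets of entries, sorted by key hash
    int freePointer = 1024;      // entries grow downward from the end of a toy 1024-byte page

    void put(long keyHash, int entrySize) {
        freePointer -= entrySize;                // append the entry bytes at the page end
        int pos = Arrays.binarySearch(hashes, keyHash);
        if (pos < 0) pos = -pos - 1;             // insertion point that keeps sorted order
        hashes = insert(hashes, pos, keyHash);
        offsets = insert(offsets, pos, freePointer); // only the small offset array is shifted
    }

    static long[] insert(long[] a, int pos, long v) {
        long[] r = Arrays.copyOf(a, a.length + 1);
        System.arraycopy(a, pos, r, pos + 1, a.length - pos);
        r[pos] = v;
        return r;
    }

    static int[] insert(int[] a, int pos, int v) {
        int[] r = Arrays.copyOf(a, a.length + 1);
        System.arraycopy(a, pos, r, pos + 1, a.length - pos);
        r[pos] = v;
        return r;
    }

    public static void main(String[] args) {
        OffsetArraySketch b = new OffsetArraySketch();
        b.put(30L, 16); // stored at offset 1008
        b.put(10L, 16); // stored at offset 992, yet listed first because its hash is smaller
        System.out.println(Arrays.toString(b.offsets)); // [992, 1008]
    }
}
```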
75+
  *
  * About the indexes of the parents of the current bucket: when an item is removed from a bucket, we check the space needed to
  * store all entries of this bucket and of its buddy bucket (the bucket that was also created from the parent bucket during the
  * split); if a single bucket is enough to hold all entries of both buckets, we remove these two buckets and put their entire
  * content into the parent bucket. That is why the indexes of the parents of the current bucket are needed.
80+
  *
  * The hash index also has a special file, one page long, which contains information about the state of each level of buckets
  * in the index. This information is stored as an array whose index equals the file level. Each array item has the following
  * structure:
  * <ol>
  * <li>Whether the level is removed (all buckets are empty or the level was not created yet) - 1 byte.</li>
  * <li>File level id - 8 bytes.</li>
  * <li>Number of buckets in the given level - 8 bytes.</li>
  * <li>Index of the page of the first removed (not split) bucket - 8 bytes.</li>
  * </ol>
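Summing the field sizes listed above gives the per-level state record size. A minimal sketch, assuming exactly those sizes (the constant names are invented for illustration):

```java
// Per-level state record from the documentation: 1 + 8 + 8 + 8 bytes.
// Names are this sketch's own, not OrientDB's.
public class LevelStateRecordSketch {
    static final int REMOVED_FLAG_SIZE = 1;   // is the level removed
    static final int FILE_LEVEL_ID_SIZE = 8;  // file's level id
    static final int BUCKETS_COUNT_SIZE = 8;  // number of buckets in the level
    static final int FIRST_REMOVED_SIZE = 8;  // page index of first removed bucket

    public static void main(String[] args) {
        System.out.println(REMOVED_FLAG_SIZE + FILE_LEVEL_ID_SIZE
            + BUCKETS_COUNT_SIZE + FIRST_REMOVED_SIZE); // 25 bytes per level record
    }
}
```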
89+
  *
  * @author Andrey Lomakin
  * @since 12.03.13
  */
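For context, the textbook extendible-hashing lookup that this class builds on maps a key's hash to a directory slot via its most significant bits. This is a sketch of the classic algorithm only, not of OrientDB's multilevel variant, and the names are illustrative:

```java
// Classic extendible-hashing directory lookup: take the top `depth` bits of
// the 64-bit key hash as the directory index.
public class ExtendibleLookupSketch {
    static int directoryIndex(long keyHash, int depth) {
        // unsigned shift keeps the result non-negative for any hash
        return (int) (keyHash >>> (64 - depth));
    }

    public static void main(String[] args) {
        // with depth 8 the directory has 256 slots, matching the 256
        // first-level buckets mentioned in the documentation above
        System.out.println(directoryIndex(0xAB00000000000000L, 8)); // 171 (0xAB)
    }
}
```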

tests/src/test/java/com/orientechnologies/orient/test/internal/index/HashIndexSpeedTest.java

Lines changed: 3 additions & 1 deletion
@@ -31,7 +31,7 @@ public void init() throws Exception {
     if (buildDirectory == null)
       buildDirectory = ".";

-    databaseDocumentTx = new ODatabaseDocumentTx("local:" + buildDirectory + "/uniqueHashIndexTest");
+    databaseDocumentTx = new ODatabaseDocumentTx("plocal:" + buildDirectory + "/uniqueHashIndexTest");
     if (databaseDocumentTx.exists()) {
       databaseDocumentTx.open("admin", "admin");
       databaseDocumentTx.drop();
@@ -46,8 +46,10 @@ public void init() throws Exception {
   @Override
   @Test(enabled = false)
   public void cycle() throws Exception {
+    databaseDocumentTx.begin();
     String key = "bsadfasfas" + random.nextInt();
     hashIndex.put(key, new ORecordId(0, new OClusterPositionLong(0)));
+    databaseDocumentTx.commit();
   }

   @Override

tests/src/test/java/com/orientechnologies/orient/test/internal/index/SBTreeInsertionSpeedTest.java

Lines changed: 3 additions & 1 deletion
@@ -34,7 +34,7 @@ public void init() throws Exception {
     if (buildDirectory == null)
       buildDirectory = ".";

-    databaseDocumentTx = new ODatabaseDocumentTx("local:" + buildDirectory + "/SBTreeInsertionSpeedTTest");
+    databaseDocumentTx = new ODatabaseDocumentTx("plocal:" + buildDirectory + "/SBTreeInsertionSpeedTTest");
     if (databaseDocumentTx.exists()) {
       databaseDocumentTx.open("admin", "admin");
       databaseDocumentTx.drop();
@@ -49,8 +49,10 @@ public void init() throws Exception {
   @Override
   @Test(enabled = false)
   public void cycle() throws Exception {
+    databaseDocumentTx.begin();
     String key = "bsadfasfas" + random.nextInt();
     index.put(key, new ORecordId(0, new OClusterPositionLong(0)));
+    databaseDocumentTx.commit();
   }

   @Override

0 commit comments
