|CONCISE: COmpressed 'N' Composable Integer SEt
Bit arrays, or bitmaps, are used to significantly speed up set operations
in several areas, such as data warehousing, information retrieval, and data
mining, to cite a few. However, bitmaps usually use a large storage space, thus
requiring compression. Here we offer a Java implementation of
CONCISE (COmpressed 'N'
SEt), a bitmap compression algorithm that,
when compared to classical compression schemes, trades some space to allow for fast bitwise operations without
first decompressing bitmaps. CONCISE can be efficiently used to manipulate sets of integral
numbers in lieu of well-known data structures such as arrays, lists, hashtables, and self-balancing binary search
trees. Click here for more details about
The most up-to-date Java code is available at the SourceForge website.
Among the classes contained within the package, the most important are:
A description for the remaining classes can be found here.
an interface that specifies several methods for the manipulation of “int” sets,
such as union, intersection, difference, complement, etc..
- ConciseSet: an implementation of the
IntSet interface internally represented by a
- FastSet: an implementation of the
IntSet interface internally represented by an
uncompressed bitmap. In general, when the integer set is sparse (i.e., the ratio between the set cardinality and the maximum integer is below 0.1),
ConciseSet is always the best choice.
For dense integer sets, FastSet can offer better performances.
an interface that extends java.util.SortedSet<Integer> by specifying several methods for the manipulation of “Integer” sets,
such as union, intersection, difference, complement, power-set, Jaccard
a class that implements the ExtendedSet interface and
backed by any class implementing the IntSet interface (e.g.,
- A document that describes the details of the CONCISE algorithm can be found here.
It also offers a comparative analysis among standard java.util collections.
- A JavaDoc-based description of the package can be found here.
To be sure that class descriptions are up-to-date, it is always better to download the source code from
the SourceForge website and generate a fresh JavaDoc
javadoc tool provided by JDK.
If you have any question/bug report/suggestion, please send me an email at:
To verify the performance of current CONCISE implementation, I adopted a very powerful profiler called YourKit.
YourKit is kindly supporting open source projects with its full-featured Java Profiler.
YourKit, LLC is the creator of innovative and intelligent tools for profiling
Java and .NET applications. Take a look at YourKit's leading software products:
YourKit Java Profiler and
YourKit .NET Profiler.
Visitors (since March 2009):