Distribution of values in Java

I wanted an easy way to count a number of values and keep their distribution, either to print it or to use it as the basis for a histogram. I first implemented it with fixed buckets, but later changed it to grow as it gets new values. The bucket sizes are based on powers of 2, so each bucket covers twice as many distinct values as its predecessor (the exception is the first two buckets, 0-1 and 2-3, which both cover two values).
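
To make the bucket layout concrete, here is a minimal sketch of the value-to-bucket mapping on its own. The class BucketSketch and its methods bucketFor, lowerBound and upperBound are hypothetical names used only for illustration; they are not part of CountDistribution. The sketch assumes the index can be computed with Long.numberOfLeadingZeros, which gives floor(log2(value)) for positive values.

```java
// Hypothetical helper for illustration only; not part of CountDistribution.
final class BucketSketch {
    private BucketSketch() {}

    /** Bucket index: 0 for values up to 1, otherwise floor(log2(value)). */
    static int bucketFor(long value) {
        if (value <= 1) {
            return 0;
        }
        return 63 - Long.numberOfLeadingZeros(value);
    }

    /** Smallest value that belongs to the bucket. */
    static long lowerBound(int bucket) {
        return bucket == 0 ? 0L : 1L << bucket;
    }

    /** Largest value that belongs to the bucket. */
    static long upperBound(int bucket) {
        return (1L << (bucket + 1)) - 1;
    }

    public static void main(String[] args) {
        // Print the bucket and its range for a handful of sample values.
        for (long v : new long[] {0, 1, 2, 3, 4, 7, 8, 100}) {
            int b = bucketFor(v);
            System.out.println(v + " -> bucket " + b
                    + " [" + lowerBound(b) + "-" + upperBound(b) + "]");
        }
    }
}
```

Because the index grows with the logarithm of the value, even the largest positive long lands in bucket 62, so the list of counters stays short no matter how large the inputs get.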

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Date: 29.10.13
 * Time: 13:27
 * @author ©2013 Fredrik Rodland, http://rodland.no
 */
public class CountDistribution {
    private final List<Long> count = Collections.synchronizedList(new ArrayList<Long>());

    private long maxValue = Long.MIN_VALUE;

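    // synchronized so that the read-increment-write on the counter is atomic
    public synchronized void add(long c) {
        int bin = getBin(c);
        // Always make sure the bin exists; checking maxValue alone fails
        // when the very first value is Long.MIN_VALUE.
        ensureCountCapacity(bin);
        if (maxValue < c) {
            maxValue = c;
        }
        count.set(bin, count.get(bin) + 1);
    }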

    private void ensureCountCapacity(int bin) {
        while (count.size() < (bin + 1)) {
            count.add(0L);
        }
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        for (int bin = 0; bin < count.size(); bin++) {
            // Separator goes before every entry but the first, so an empty
            // distribution yields "" instead of throwing in substring().
            if (bin > 0) {
                sb.append(", ");
            }
            sb.append(getBinTxt(bin)).append(": ").append(count.get(bin));
        }
        return sb.toString();
    }

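    protected String getBinTxt(int bin) {
        if (bin == 0) {
            return "0-1";
        }
        // Use long shifts: (int) Math.pow(2, n) saturates at Integer.MAX_VALUE
        // once n reaches 31, which gives wrong labels for the larger bins.
        return (1L << bin) + "-" + ((1L << (bin + 1)) - 1);
    }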

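    int getBin(long value) {
        if (value <= 1) {
            return 0;
        }
        // Exact floor(log2(value)); the floating-point Math.log(value) / Math.log(2)
        // can land in the wrong bin for large values.
        return 63 - Long.numberOfLeadingZeros(value);
    }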
}

The following method:

public void test_toString() {
    CountDistribution hist = new CountDistribution();

    hist.add(0);
    hist.add(1);
    hist.add(1);

    hist.add(2);
    hist.add(2);
    hist.add(2);
    hist.add(3);
    hist.add(3);

    hist.add(4);

    hist.add(33);

    hist.add(100);
    hist.add(123);

    System.out.println(hist.toString());
}

Will produce the following output:

0-1: 3, 2-3: 5, 4-7: 1, 8-15: 0, 16-31: 0, 32-63: 1, 64-127: 2
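
Since the counts are meant to serve as the basis for a histogram, the same numbers can also be drawn as text bars. The following is only a rough sketch: HistogramSketch, label and render are hypothetical names, and the counts are passed in as a plain long[] because CountDistribution, as listed above, does not expose its counters.

```java
// Hypothetical helper for illustration only; not part of CountDistribution.
final class HistogramSketch {
    private HistogramSketch() {}

    /** Range label for a bin, in the same "low-high" style as the output above. */
    static String label(int bin) {
        return bin == 0 ? "0-1" : (1L << bin) + "-" + ((1L << (bin + 1)) - 1);
    }

    /** One line per bin: right-aligned label, one '#' per counted value, then the count. */
    static String render(long[] counts) {
        StringBuilder sb = new StringBuilder();
        for (int bin = 0; bin < counts.length; bin++) {
            sb.append(String.format("%8s | ", label(bin)));
            for (long i = 0; i < counts[bin]; i++) {
                sb.append('#');
            }
            sb.append(' ').append(counts[bin]).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The counts from the output above, one entry per bin.
        System.out.print(render(new long[] {3, 5, 1, 0, 0, 1, 2}));
    }
}
```

For large counts one '#' per value quickly becomes unwieldy; scaling the bar length against the largest count would be a natural next step.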

Source code

CountDistribution.java
CountDistributionTest.java

If you have any comments, bugs or suggestions on a better implementation, please feel free to comment below.
