Linux NTP Server Date Time Force Sync/Update

You may also notice that ntpd can't update the time once the drift has grown too large, even after restarting the service. How can we force it to sync the time?

On CentOS/Linux:

Open the file /etc/sysconfig/ntpd

and change
OPTIONS="-u ntp:ntp -p /var/run/ntpd.pid"
to
OPTIONS="-x -u ntp:ntp -p /var/run/ntpd.pid"

That -x is a very tiny change, but it has a huge effect. When you stop/start ntpd (or it starts on a reboot of your system), it does the equivalent of
ntpdate -u time.server.of.choice
i.e., it forces a manual update against your chosen time server. No more manually fixing drift that has grown too wide. After a reboot, the time is set to a value that ntpd can then keep updated automatically going forward.

Try running
service ntpd restart
and you’ll see it do the manual time update.

Reference: http://tech.darke.net/2011/12/19/ntp-force-update/


How to increment a Map value in Java

Reference: http://stackoverflow.com/questions/81346/most-efficient-way-to-increment-a-map-value-in-java

The answerer on Stack Overflow tested several methods:

  • the “ContainsKey” method that I presented in the question
  • the “TestForNull” method suggested by Aleksandar Dimitrov
  • the “AtomicLong” method suggested by Hank Gay
  • the “Trove” method suggested by jrudolph
  • the “MutableInt” method suggested by phax.myopenid.com

Method

Here’s what I did…

  1. created five classes that were identical except for the differences shown below. Each class had to perform an operation typical of the scenario I presented: opening a 10MB file and reading it in, then performing a frequency count of all the word tokens in the file. Since this took an average of only 3 seconds, I had it perform the frequency count (not the I/O) 10 times.
  2. timed the loop of 10 iterations but not the I/O operation and recorded the total time taken (in clock seconds) essentially using Ian Darwin’s method in the Java Cookbook.
  3. performed all five tests in series, and then did this another three times.
  4. averaged the four results for each method.

Results

I’ll present the results first and the code below for those who are interested.

The ContainsKey method was, as expected, the slowest, so I'll give the speed of each method relative to it.

  • ContainsKey: 30.654 seconds (baseline)
  • AtomicLong: 29.780 seconds (1.03 times as fast)
  • TestForNull: 28.804 seconds (1.06 times as fast)
  • Trove: 26.313 seconds (1.16 times as fast)
  • MutableInt: 25.747 seconds (1.19 times as fast)

Conclusions

It would appear that only the MutableInt method and the Trove method are significantly faster, in that only they give a performance boost of more than 10%. However, if threading is an issue, AtomicLong might be more attractive than the others (I’m not really sure). I also ran TestForNull with final variables, but the difference was negligible.
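If threading is a concern, a modern alternative (assuming Java 8 or later, which postdates the original answer) is to combine ConcurrentHashMap.computeIfAbsent with java.util.concurrent.atomic.LongAdder, a class designed specifically for high-contention counters. A minimal sketch:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

public class ConcurrentCount {
    public static void main(String[] args) {
        // computeIfAbsent is atomic on ConcurrentHashMap, so concurrent
        // threads never clobber each other's newly created counters.
        ConcurrentMap<String, LongAdder> freq = new ConcurrentHashMap<>();
        for (String word : new String[]{"the", "cat", "the"}) {
            freq.computeIfAbsent(word, k -> new LongAdder()).increment();
        }
        System.out.println(freq.get("the").sum()); // prints 2
    }
}
```

Unlike the AtomicLong variant below, this only allocates a counter when the key is actually absent, rather than on every putIfAbsent call.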

Note that I haven’t profiled memory usage in the different scenarios. I’d be happy to hear from anybody who has good insights into how the MutableInt and Trove methods would be likely to affect memory usage.

Personally, I find the MutableInt method the most attractive, since it doesn’t require loading any third-party classes. So unless I discover problems with it, that’s the way I’m most likely to go.

The code

Here is the crucial code from each method.

ContainsKey

import java.util.HashMap;
import java.util.Map;
...
Map<String, Integer> freq = new HashMap<String, Integer>();
...
int count = freq.containsKey(word) ? freq.get(word) : 0;
freq.put(word, count + 1);

TestForNull

import java.util.HashMap;
import java.util.Map;
...
Map<String, Integer> freq = new HashMap<String, Integer>();
...
Integer count = freq.get(word);
if (count == null) {
    freq.put(word, 1);
} else {
    freq.put(word, count + 1);
}

AtomicLong

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;
...
final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong>();
...
map.putIfAbsent(word, new AtomicLong(0));
map.get(word).incrementAndGet();

Trove

import gnu.trove.TObjectIntHashMap;
...
TObjectIntHashMap<String> freq = new TObjectIntHashMap<String>();
...
freq.adjustOrPutValue(word, 1, 1);

MutableInt

import java.util.HashMap;
import java.util.Map;
...
class MutableInt {
    int value = 1; // note that we start at 1 since we're counting

    public void increment() {
        ++value;
    }

    public int get() {
        return value;
    }
}
...
Map<String, MutableInt> freq = new HashMap<String, MutableInt>();
...
MutableInt count = freq.get(word);
if (count == null) {
    freq.put(word, new MutableInt());
} else {
    count.increment();
}
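For completeness: on Java 8 or later (which postdates the original answer), the whole null-check pattern collapses into a one-liner via Map.merge, which inserts the given value for an absent key and otherwise applies the remapping function to combine old and new values. A minimal sketch:

```java
import java.util.HashMap;
import java.util.Map;

public class MergeCount {
    public static void main(String[] args) {
        Map<String, Integer> freq = new HashMap<>();
        for (String word : new String[]{"the", "cat", "the"}) {
            // put 1 if the key is absent, otherwise add 1 to the existing count
            freq.merge(word, 1, Integer::sum);
        }
        System.out.println(freq.get("the")); // prints 2
    }
}
```

Performance-wise this still boxes Integers like TestForNull does, but it is by far the most readable of the options.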


Hbase Data Export, Import and Migrate

CopyTable

CopyTable is a utility that can copy part or all of a table, either to the same cluster or to another cluster. The usage is as follows:

$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable [--starttime=X] [--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] tablename

Options:

  • starttime Beginning of the time range. Without endtime, means starttime to forever.
  • endtime End of the time range.
  • versions Number of cell versions to copy.
  • new.name New table’s name.
  • peer.adr Address of the peer cluster given in the format hbase.zookeeper.quorum:hbase.zookeeper.client.port:zookeeper.znode.parent
  • families Comma-separated list of ColumnFamilies to copy.
  • all.cells Also copy delete markers and uncollected deleted cells (advanced option).

Args:

  • tablename Name of table to copy.


Example of copying ‘TestTable’ to a cluster that uses replication for a 1 hour window:

$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
    --starttime=1265875194289 --endtime=1265878794289 \
    --peer.adr=server1,server2,server3:2181:/hbase TestTable

Export

Export is a utility that will dump the contents of a table to HDFS in a sequence file. Invoke via:

$ bin/hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]


Note: caching for the input Scan is configured via hbase.client.scanner.caching in the job configuration.

Import

Import is a utility that will load data that has been exported back into HBase. Invoke via:

$ bin/hbase org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>

Please refer to http://hbase.apache.org/book/ops.backup.html for more details.
