CACSS Cloud Storage System Web Login Portal

CACSS Cloud Storage System (Big Data Ready) Live Demonstration (Amazon S3, Rackspace, Google Cloud Storage Alternatives)


CACSS Cloud Storage System

  • Neither Amazon S3’s architecture nor its implementation has yet been made public. As such, it cannot be extended to build private clouds of any size. To reveal the knowledge behind cloud storage services, and thereby provide a generic solution, we present CACSS: a generic computational and adaptive cloud storage system that adapts existing storage technologies to provide efficient and scalable services.
  • CACSS Cloud Storage System is the implementation of one of my papers, CACSS: Towards a Generic Cloud Storage Service (in: The 2nd International Conference on Cloud Computing and Services Science, CLOSER 2012, Porto, Portugal, pages 27-36, SciTePress 2012, ISBN 978-989-8565-05-1).
  • It is Amazon S3 API compatible (you only need to change the endpoint).
  • It is currently deployed across a cluster of virtual machines.
  • Some functions, such as object versioning, are currently disabled.
  • Please feel free to contact me for any questions you may have.

Web Portal: or

End Point: (use port 80, SSL is not yet supported)

CACSS Open Source Cloud Storage System Web Login Portal

CACSS Open Source Cloud Storage System Web Control Panel Demonstration

Keywords: Amazon S3, Amazon S3 Architecture, Open Source Amazon S3, Open Source Cloud Storage System, CACSS, Big Data Storage



Hbase Data Export, Import and Migrate



CopyTable is a utility that can copy part of or all of a table, either to the same cluster or to another cluster. The usage is as follows:

$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable [--starttime=X] [--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] tablename


  • starttime Beginning of the time range. Without endtime, this means from starttime to forever.
  • endtime End of the time range.
  • versions Number of cell versions to copy.
  • new.name New table’s name.
  • peer.adr Address of the peer cluster, given in the format hbase.zookeeper.quorum:hbase.zookeeper.client.port:zookeeper.znode.parent
  • families Comma-separated list of column families to copy.
  • all.cells Also copy delete markers and uncollected deleted cells (advanced option).


  • tablename Name of table to copy.


Example of copying ‘TestTable’ to a peer cluster, limited to a 1-hour time window:

$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
    --starttime=1265875194289 --endtime=1265878794289 \
    --peer.adr=server1,server2,server3:2181:/hbase TestTable



Export is a utility that dumps the contents of a table to HDFS as a sequence file. Invoke it via:

$ bin/hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]


Note: caching for the input Scan is configured via hbase.client.scanner.caching in the job configuration.


Import is a utility that loads previously exported data back into HBase. Invoke it via:

$ bin/hbase org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>



HBase does not support row key renaming

As far as I know, the key of a row in HBase cannot be changed. If you really need a row key rename function, the best thing to do is to copy all the data from one row to another row using the Java API.

e.g. I have an existing row with key “key1”, and I want to create a row with key “key2” copied from the “key1” row. It is simple:

// let's say you already got the Result from table.get(new Get(Bytes.toBytes("key1")))
Put put = new Put(Bytes.toBytes("key2"));
NavigableMap<byte[], NavigableMap<byte[], byte[]>> familyQualifierMap = result.getNoVersionMap();
for (byte[] familyBytes : familyQualifierMap.keySet()) {
    NavigableMap<byte[], byte[]> qualifierMap = familyQualifierMap.get(familyBytes);
    for (byte[] qualifier : qualifierMap.keySet()) {
        // replay every cell of the old row into the new row's Put
        put.add(familyBytes, qualifier, qualifierMap.get(qualifier));
    }
}
table.put(put); // write the new row; delete "key1" afterwards if desired
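
The snippet above needs a live HBase table to run. As a hypothetical, self-contained sketch of the same family → qualifier → value traversal (plain java.util maps standing in for HBase's Result and Put types, so no cluster is required):

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical standalone sketch of the row-copy logic: HBase's
// Result.getNoVersionMap() returns family -> (qualifier -> value), and copying
// a row means replaying every cell under the new key. TreeMaps stand in for
// the HBase types here.
public class RowCopySketch {

    // Deep-copy every family/qualifier/value cell of one row.
    public static NavigableMap<String, NavigableMap<String, String>> copyRow(
            NavigableMap<String, NavigableMap<String, String>> src) {
        NavigableMap<String, NavigableMap<String, String>> dst = new TreeMap<>();
        for (String family : src.keySet()) {
            NavigableMap<String, String> qualifiers = src.get(family);
            NavigableMap<String, String> dstQualifiers = new TreeMap<>();
            for (String qualifier : qualifiers.keySet()) {
                dstQualifiers.put(qualifier, qualifiers.get(qualifier)); // put.add(...) in HBase
            }
            dst.put(family, dstQualifiers);
        }
        return dst;
    }

    public static void main(String[] args) {
        NavigableMap<String, NavigableMap<String, String>> key1 = new TreeMap<>();
        key1.put("cf", new TreeMap<>());
        key1.get("cf").put("name", "leon");
        System.out.println(copyRow(key1)); // prints {cf={name=leon}}
    }
}
```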



How to Make/Convert an Animated GIF from a Video online

There are many reasons why someone would want to convert a video into a GIF animation. The main one is to put an animation of a video online, such as on Twitter or Weibo, without streaming the video itself. But for those not acquainted with the process, it can seem difficult.

I have developed a website offering an online video-to-animated-GIF conversion service.

Cloudpapa is a powerful online service, built in a cloud computing environment, that converts video files to animated GIF files. You can convert a selected part of a video or an entire video file. Picture cropping and flipping are supported. You may also use it to make grayscale or sepia GIF files. It supports various popular video formats, such as avi, mp4, wmv, asf, mpg, dat, 3gp, flv, f4v, mov, mkv, rm, rmvb, swf, vob and webm. The interface is very user-friendly and easy to use.

Main supported video formats: 3GPP files (*.3gp, *.3g2, *.3gpp, *.3g, *.k3g, *.3gp2); AMV files (*.amv); AVI files (*.avi); Digital Video files (*.dv); DVD Video files (*.vob); Flash Video (*.flv, *.f4v, *.swf); Matroska Video (*.mkv); MPEG files (*.mpg, *.mpeg, *.mpe, *.m1v, *.m2v, *.m2t, *.tod); MPEG-4 files (*.mp4, *.m4v); NDS DPG files (*.dpg); QuickTime files (*.mov, *.qt); RealMedia files (*.rm, *.rmvb); VCD Movie files (*.dat); WebM Video files (*.webm); Windows Media Video files (*.wmv, *.asf).

cloudpapa Video to Gif online converter maker

Cloudpapa Gif example leon snowboarding


Dynamic and high performance Cloud Storage system

A 2008 study at the University of California, Santa Cruz, funded by the National Science Foundation, showed that more than 60% of data is inactive or static but is kept just in case it is needed. Storing inactive and active data in the same place is not cost-effective. The same problem also exists among cloud storage providers.

  • Dynamic file storage placement: to balance performance with cost, cloud storage providers need to adopt multi-tier storage systems that automatically move inactive data to lower-cost storage and retrieve it immediately when needed. The data movement is accomplished in a manner that is transparent to the end users.
    • File access pattern recognition
    • File types: sequential access, random access
    • Performance on-demand service: to enable users to reserve storage and network I/O for specific files at a specific time in advance. This guarantees the performance of those files. Both users and providers benefit from this service: users can pay for better performance on demand without much effort, and providers can better schedule resources based on the reservations.
      • Storage system and network I/O scheduling
      • Spot pricing: dynamic price over time
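
As an illustration of the dynamic-placement idea above, here is a minimal, hypothetical tiering rule (a sketch, not actual CACSS code): a file whose last access is older than a configurable threshold is demoted to the low-cost tier, otherwise it stays on the fast tier.

```java
import java.time.Duration;
import java.time.Instant;

// Minimal, hypothetical sketch of the dynamic file-placement rule described
// above (not actual CACSS code): inactive files are demoted to cheap storage.
public class TieringSketch {
    public enum Tier { HOT_SSD, COLD_HDD }

    // Place a file on a tier based on how long ago it was last accessed.
    public static Tier placeFile(Instant lastAccess, Instant now, Duration inactiveThreshold) {
        return Duration.between(lastAccess, now).compareTo(inactiveThreshold) > 0
                ? Tier.COLD_HDD   // inactive: move to lower-cost storage
                : Tier.HOT_SSD;   // active: keep on the fast tier
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2020-03-01T00:00:00Z");
        // untouched for 60 days -> demoted; touched 2 days ago -> stays hot
        System.out.println(placeFile(Instant.parse("2020-01-01T00:00:00Z"), now, Duration.ofDays(30)));
        System.out.println(placeFile(Instant.parse("2020-02-28T00:00:00Z"), now, Duration.ofDays(30)));
    }
}
```

A real system would first apply the access-pattern recognition listed above (sequential vs. random) rather than relying on age alone.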


Existing products and research:

Hitachi: Tiered Storage Manager

Hystor: Making the Best Use of Solid State Drives in High Performance Storage Systems.

Energy Efficient Storage Management Cooperated with Large Data Intensive Applications

HybridStore: A Cost-Efficient, High-Performance Storage System Combining SSDs and HDDs

Cost Effective Storage using Extent based Dynamic Tiering


storage workflow dropbox

Future Collaboration: Dropbox+Workflow+Social Network concept

How do we collaborate now?

—  Email attachments: the most common way. It is difficult to complete complex workflow tasks, and there is no data synchronization.

—  Microsoft SharePoint: static workflows on documents. It does not fit the open cloud environment well; it is difficult to create temporary groups and share data.

—  Google Docs: editing documents online at the same time. It provides basic permission controls over who can read/write the documents.

—  Huddle: assign single tasks to users on a specific document, tracked by start and end dates.


SharePoint (Microsoft)

—  Document collaboration means several authors work on a document or collection of documents together. They could be simultaneously co-authoring a document or reviewing a specification as part of a structured workflow.

—  Semiformal co-authoring: Multiple authors edit simultaneously anywhere in the document.

—  Formal co-authoring: Multiple authors edit simultaneously in a controlled way by saving content when ready to be revealed. Examples include: business plans, newsletters, and legal briefs for Word; and marketing and conference presentations for PowerPoint.

—  Comment and review: A primary author solicits edits and comments (which can be threaded discussions) by routing the document in a workflow, but controls final document publishing. Examples include online help, white papers, and specifications.

—  Document sets: Multiple authors are assigned separate documents as part of a workflow, and then one master document set is published. Examples include: new product literature and a sales pitch book.


Typical story:

  1. A teacher sends homework to all the students after the class.
  2. All students receive the homework file.
  3. Students complete the homework.
  4. The homework file is submitted to the not-graded folder.
  5. The teacher gets a message and starts marking it, or sends the homework to teaching helpers.
  6. The homework is marked and sent back to the student to redo any incorrect sections.
  7. The student updates the document accordingly and completes it.
  8. The teacher grades the document again and completes the process.
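
The numbered story above can be sketched as a small state machine. The states and transitions below are hypothetical and simply mirror the steps; they are not from any shipped workflow engine.

```java
// Hypothetical sketch of the homework workflow above as a state machine.
public class HomeworkWorkflow {
    public enum State { ASSIGNED, SUBMITTED, MARKED, RETURNED, COMPLETED }

    // Advance the document one step along the workflow described above.
    public static State next(State s, boolean allSectionsCorrect) {
        switch (s) {
            case ASSIGNED:  return State.SUBMITTED;  // student hands the homework in
            case SUBMITTED: return State.MARKED;     // teacher or helper marks it
            case MARKED:    return allSectionsCorrect ? State.COMPLETED : State.RETURNED;
            case RETURNED:  return State.SUBMITTED;  // student resubmits the fixes
            default:        return State.COMPLETED;  // terminal state
        }
    }

    public static void main(String[] args) {
        System.out.println(next(State.ASSIGNED, false)); // SUBMITTED
        System.out.println(next(State.MARKED, false));   // RETURNED
        System.out.println(next(State.MARKED, true));    // COMPLETED
    }
}
```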

Open State Scenario

For example, in an open research environment, researchers need to circulate documents to the appropriate people when needed. Often there are legal constraints, such as where the data can be accessed and how these documents may be circulated. In such a project, for example, data cannot leak beyond the designated region.


Nature of Cloud Computing Part 1

“Cloud Computing” is becoming one of the hottest buzzwords in the IT industry. It has been brewing over the last few years, and it is now publicly recognized as the fifth paradigm shift in computing. Leading IT companies such as Amazon, IBM, Microsoft and Google have begun to develop cloud computing concepts with enormous investment, and in many countries governments have launched schemes to catalyse the development of this new technology. In this paper, the fundamental concepts of cloud computing and utility computing, and the existing price models of major cloud providers, are described.

Concepts of Cloud Computing

Cloud computing was first envisioned by Stanford professor John McCarthy, who predicted in 1961 that “computation may someday be organized as a public utility”.

Even now, cloud computing is still an evolving concept, and for this reason its definition varies widely. From a technical aspect, “cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.”

Utility Computing

Utility computing is the packaging of computing resources, such as computation and storage, as a metered service similar to a traditional public utility (such as electricity, water, natural gas, or the telephone network). Cloud computing and utility computing are similar: users are able to request and use a specific amount of resources on demand, delivered as services.


NIST Definition of Cloud Computing, v15

Decision Support and Business Intelligence, 8th edition, page 680
