Hadoop de Doo

by Bryon Beilman | Jun 03, 2008 | Uncategorized | 0 comments

I have found that virtualization technologies like VMware and Xen have added real value to companies in many ways; we use VMware for our own business and for our customers. After reading about the IBM and Google cloud computing initiative, I found that it uses Xen and Hadoop. Hadoop? I had not heard of Hadoop, but I had previously read the technical paper it is based on, which describes the Google File System. Hadoop is an open source implementation of the Google File System. I started thinking about how I would use this, and whether I should.

We have used Sun Grid Engine at a number of our clients, and although we often use a NetApp or StoreVault filer for centralized application and project files, I don't like to use that valuable resource for transient data and simulation files. EDA applications like Cadence Spectre and Agilent ADS generate simulation output files that can be very large. So if you launch a simulation through the grid, the least-loaded machine in the cluster may take the job, and that machine may be configured to save the transient data to its local disk, which may not be the machine you normally work on. Past experience has shown that Linux NFS file servers cannot handle high-throughput loads.
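To make that concrete, a Grid Engine job script along these lines illustrates how the output ends up on whichever node runs the job. The job name, resource request, simulator invocation, and paths here are illustrative assumptions, not a real site configuration:

```shell
#!/bin/sh
# Hypothetical SGE job script -- names, resource requests, and paths
# are illustrative, not from an actual deployment.
#$ -N spectre_sim
#$ -cwd
#$ -l mem_free=4G

# SGE dispatches to the least-loaded node; $TMPDIR is that node's
# local scratch, so the large raw output lands on whatever machine
# happened to take the job -- not necessarily your workstation.
spectre input.scs -raw $TMPDIR/results.raw

# Copy back only what is worth keeping to the shared filer.
cp $TMPDIR/results.raw /projects/chipA/results/
```

The trade-off is exactly the one described above: the filer is spared the transient I/O, but the data is now scattered across local disks.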

Now there is Hadoop. It should be possible to create a Hadoop Distributed File System server that handles the cluster's transient data by pooling the distributed network of local disks. This is worth investigating and prototyping. Stay tuned and I will let you know how it performs.
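As a sketch of what the prototype configuration might look like: the key part is pointing each DataNode at its own local scratch disks so HDFS aggregates them into one pool. The hostname, port, and disk paths in this hadoop-site.xml fragment are assumptions for illustration, not a tested setup:

```xml
<?xml version="1.0"?>
<!-- Hypothetical hadoop-site.xml fragment: hostname, port, and
     disk paths are illustrative assumptions only. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode-host:9000</value>
    <!-- The NameNode that tracks where file blocks live. -->
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/scratch1/hdfs,/scratch2/hdfs</value>
    <!-- Each DataNode stores blocks on its own local disks. -->
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <!-- Two copies, so losing one node's disk doesn't lose data. -->
  </property>
</configuration>
```

With the daemons running, simulation output could be pushed into the pool with `hadoop fs -put`. One caveat worth noting up front: HDFS is designed for large streaming writes, not POSIX-style random access, so it would complement the filer rather than replace it.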
