<?xml version="1.0" encoding="UTF-8"?> <rss
version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
><channel><title>Werx Limited &#187; distributed computing</title> <atom:link href="http://werxltd.com/wp/tag/distributed-computing/feed/" rel="self" type="application/rss+xml" /><link>http://werxltd.com/wp</link> <description>We make IT work.</description> <lastBuildDate>Mon, 23 Jan 2012 23:03:59 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.3.1</generator> <item><title>Diskless computing vs distributed computing</title><link>http://werxltd.com/wp/2009/09/03/diskless-computing-vs-distributed-computing/</link> <comments>http://werxltd.com/wp/2009/09/03/diskless-computing-vs-distributed-computing/#comments</comments> <pubDate>Thu, 03 Sep 2009 14:56:35 +0000</pubDate> <dc:creator>wes</dc:creator> <category><![CDATA[general]]></category> <category><![CDATA[it industry]]></category> <category><![CDATA[cloud computing]]></category> <category><![CDATA[cluster computing]]></category> <category><![CDATA[diskless computing]]></category> <category><![CDATA[distributed computing]]></category> <category><![CDATA[headless computing]]></category> <category><![CDATA[network administration]]></category> <category><![CDATA[parallel processing]]></category> <category><![CDATA[seti@home]]></category> <category><![CDATA[terminal server]]></category><guid
isPermaLink="false">http://werxltd.com/wp/?p=197</guid> <description><![CDATA[A friend of mine recently asked me about cloud computing, what it was, and what it means for where we will see technology in the coming years. In his question he demonstrated a common confusion between cloud computing and diskless computing. Both of these are interesting areas of computer [...]]]></description> <content:encoded><![CDATA[<p>A friend of mine recently asked me about cloud computing, what it was, and what it means for where we will see technology in the coming years. In his question he demonstrated a common confusion between cloud computing and diskless computing.</p><p>Both of these are interesting areas of computer science. They sometimes overlap, and both are going to change computing in significant ways as time rolls on, but they are not the same.</p><p>Here are the differences, to help you tell them apart.</p><h3>Diskless computing</h3><p><a
href="http://en.wikipedia.org/wiki/Diskless_node">Diskless computing</a> is best demonstrated in the <a
href="http://ltsp.org/">Linux Terminal Server Project</a> (an excellent project; I&#8217;ve used it to deploy over 150 diskless workstations in a company) and Microsoft&#8217;s pathetic rival, <a
href="http://www.microsoft.com/windowsserver2003/technologies/terminalservices/default.mspx">Windows Terminal Services</a>. Sun has their <a
href="http://www.sun.com/desktop/sun-ray-clients.jsp">own solution</a> as well and there are countless 3rd party utilities, but the basic idea behind them all is that you have one big computer (or series of computers) that all these &#8220;headless&#8221; computers connect to in order to retrieve an operating system, store files, etc. For large networks this network model is absolutely amazing.</p><h3>Cloud Computing</h3><p><a
href="http://en.wikipedia.org/wiki/Cloud_computing">Cloud computing</a>, however, is the concept that you have a large problem that requires a lot of computing power to solve. Rather than buy bigger and bigger hardware, what we&#8217;ve found out (going back to <a
href="http://www.cray.com/Home.aspx">Cray supercomputers</a>) is that it is far better to split the problem down into iterative chunks and push those through multiple processors all at once rather than try to get a single processor to process everything. This is called <a
href="http://en.wikipedia.org/wiki/Distributed_computing">distributed computing</a>.</p><p>You might have heard of one of the major platforms for this type of computing, <a
href="http://www.beowulf.org/">Beowulf</a>, from the popular <a
href="http://en.wikipedia.org/wiki/Internet_meme">internet meme</a> &#8220;imagine a beowulf cluster of&#8230;&#8221; Another very popular distributed computing platform (popular because it is far easier to install, operate, and write code for than the Beowulf project) is <a
href="http://werxltd.com/wp/2009/08/26/getting-starte-with-hadoop-and-mapreduce/">Hadoop</a>. Hadoop is a project inspired by Google&#8217;s implementation of the MapReduce design paradigm; it is written in Java, which makes it a lot more portable.</p><h3>Projects using Cloud Computing</h3><p>Parallel processing is done today in a wide variety of settings including:</p><ul><li>3D rendering farms for companies such as Disney&#8217;s Pixar</li><li>indexing the web with Google, Yahoo, Microsoft, etc.</li><li><a
href="http://werxltd.com/wp/2009/08/31/an-introduction-to-statistics-and-data-mining/">data mining</a> of all sorts with companies like Wal-Mart, etc.</li></ul><h3>Join in!</h3><p>There are some very popular projects using distributed computing technologies that regular people with CPU cycles to spare are encouraged to join in on like:</p><ul><li><a
href="http://setiathome.ssl.berkeley.edu/">SETI@home</a> where you can help process data that might help us identify extraterrestrial signals</li><li><a
href="http://folding.stanford.edu/">Folding@home</a> where you can help search for cures to various diseases</li><li><a
href="http://genomeathome.stanford.edu/">Genome@home</a> where you can help map the human genome (again); this is closely tied to the Folding@home project above</li><li><a
href="http://www.boingboing.net/2004/05/26/shrekhome-bluesky-pr.html">Shrek@home</a> which was a pioneer project that a few of us got to participate in</li><li><a
href="http://www.friedbeef.com/9-world-changing-projects-that-your-computer-can-participate-in/">others</a>, including <a
href="http://fightaidsathome.scripps.edu/">fightaids@home</a> to help fight AIDS and <a
href="http://lhcathome.cern.ch/">lhc@home</a> to process the massive amounts of data coming from <a
href="http://en.wikipedia.org/wiki/Large_Hadron_Collider">CERN&#8217;s Large Hadron Collider</a></li></ul><p>So while diskless computing and cloud computing can have some areas of overlap (I configured the LTSP network I mentioned earlier to assist with the genome@home project when the systems were idle) they aren&#8217;t necessarily tied together.</p>]]></content:encoded> <wfw:commentRss>http://werxltd.com/wp/2009/09/03/diskless-computing-vs-distributed-computing/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>An introduction to statistics and data mining</title><link>http://werxltd.com/wp/2009/08/31/an-introduction-to-statistics-and-data-mining/</link> <comments>http://werxltd.com/wp/2009/08/31/an-introduction-to-statistics-and-data-mining/#comments</comments> <pubDate>Mon, 31 Aug 2009 19:39:17 +0000</pubDate> <dc:creator>wes</dc:creator> <category><![CDATA[general]]></category> <category><![CDATA[it industry]]></category> <category><![CDATA[data mining]]></category> <category><![CDATA[data visualization]]></category> <category><![CDATA[distributed computing]]></category> <category><![CDATA[statistics]]></category><guid
isPermaLink="false">http://werxltd.com/wp/?p=187</guid> <description><![CDATA[Following my recent post on Hadoop and MapReduce, I want to share a few helpful resources I&#8217;ve found in the areas of data mining and statistical analysis. I&#8217;ll look into helpful ways of visualizing data later on (including new/improved helpful charting libraries from Google), however this post will deal almost exclusively with the question of [...]]]></description> <content:encoded><![CDATA[<p>Following my recent post on <a
href="http://werxltd.com/wp/2009/08/26/getting-starte-with-hadoop-and-mapreduce/">Hadoop and MapReduce</a>, I want to share a few helpful resources I&#8217;ve found in the areas of data mining and statistical analysis. I&#8217;ll look into helpful ways of visualizing data later on (including new/improved helpful <a
href="http://code.google.com/apis/chart/">charting libraries from Google</a>); however, this post will deal almost exclusively with the question of how to go about understanding and acquiring helpful sets of data.</p><h3>Introduction</h3><p><a
href="http://video.google.com/videoplay?docid=-7252045691453600738&amp;ei=HgqcSuGsAojMqgKMj7S9BA&amp;hl=en">Here is a fairly helpful broad introduction</a> to data mining and its applications.</p><h3>Crash course</h3><p>The best introduction to these subjects I&#8217;ve found is a series of <a
href="http://www.stats202.com">&#8220;Stats 202&#8221;</a> videos done by Stanford professor <a
href="http://www.davemease.com/">David Mease</a>:<br
/> <a
href="http://www.stats202.com/original_index.html">Statistical Aspects of Data Mining (Stats 202)</a>: <a
href="http://www.youtube.com/watch?v=zRsMEl6PHhM">Lecture 1</a>, <a
href="http://www.youtube.com/watch?v=YFC2KUmEebc">Lecture 2</a>, <a
href="http://www.youtube.com/watch?v=1HAAF4UT75o">Lecture 3</a>, <a
href="http://www.youtube.com/watch?v=qBcI9WakS2o">Lecture 4</a>, <a
href="http://www.youtube.com/watch?v=iXCPJNT9ZOQ">Lecture 5</a>, <a
href="http://www.youtube.com/watch?v=XzxGnF_eiNo">Lecture 6</a>, <a
href="http://www.youtube.com/watch?v=FoKxzorQIhU">Lecture 7</a>, <a
href="http://www.youtube.com/watch?v=N5i85v0ckzY">Lecture 8</a>, <a
href="http://www.youtube.com/watch?v=xpuB9ydmBsM">Lecture 9</a>, <a
href="http://www.youtube.com/watch?v=CzvgrcQhWGg">Lecture 10</a>, <a
href="http://www.youtube.com/watch?v=l4a3e__QzoY">Lecture 11</a>, <a
href="http://www.youtube.com/watch?v=fmZYH3rmqDQ">Lecture 12</a>, <a
href="http://www.youtube.com/watch?v=-tWS0tN8sW0">Lecture 13</a></p><h3>Tools</h3><p>It may surprise you to find this out, but the easiest and fastest tools to use when starting out are generally spreadsheet applications like <a
href="http://office.microsoft.com/en-us/excel/default.aspx">Microsoft Excel</a> and <a
href="http://www.openoffice.org/product/calc.html">OpenOffice&#8217;s Calc</a> which will help you quickly import and visualize your data.</p><p>However, another popular tool for statistics and data mining is the <a
href="http://www.r-project.org/">R Project for Statistical Computing</a> which is free and has binaries for Windows, Mac, and Linux. R also includes a helpful &#8220;sample&#8221; function that draws a random subset of your data, so you can look for meaningful results without having to process it all at once.</p><p>Know of any other helpful sites or statistical tools? Post them below!</p><p>Helpful hint regarding videos: if you are like me and prefer to watch/listen to long lectures in your car or otherwise on the go on your netbook, iPod or other mobile device, try looking for the above-mentioned videos on Google Video instead of YouTube. Google Video includes a helpful download link that allows you to take a copy of the movie with you.</p>]]></content:encoded> <wfw:commentRss>http://werxltd.com/wp/2009/08/31/an-introduction-to-statistics-and-data-mining/feed/</wfw:commentRss> <slash:comments>3</slash:comments> </item> <item><title>Getting started with Hadoop and MapReduce</title><link>http://werxltd.com/wp/2009/08/26/getting-starte-with-hadoop-and-mapreduce/</link> <comments>http://werxltd.com/wp/2009/08/26/getting-starte-with-hadoop-and-mapreduce/#comments</comments> <pubDate>Wed, 26 Aug 2009 18:12:58 +0000</pubDate> <dc:creator>wes</dc:creator> <category><![CDATA[it industry]]></category> <category><![CDATA[software development]]></category> <category><![CDATA[cloud computing]]></category> <category><![CDATA[distributed computing]]></category> <category><![CDATA[hadoop]]></category> <category><![CDATA[java]]></category> <category><![CDATA[mapreduce]]></category> <category><![CDATA[pyhon]]></category> <category><![CDATA[python]]></category><guid
isPermaLink="false">http://werxltd.com/wp/?p=175</guid> <description><![CDATA[Recently I&#8217;ve been studying several technologies that appear to form the core of cloud computing. In short, these are the technologies behind such technological marvels as Amazon, Google, Facebook, Yahoo, NetFlix, Pixar, etc.1 Since each of these technologies by themselves is worthy of a new book, and since even those familiar with the common implementation [...]]]></description> <content:encoded><![CDATA[<p>Recently I&#8217;ve been studying several technologies that appear to form the core of cloud computing. In short, these are the technologies behind such technological marvels as <a
href="http://amazon.com">Amazon</a>, <a
href="http://google.com">Google</a>, <a
href="http://facebook.com">Facebook</a>, <a
href="http://yahoo.com">Yahoo</a>, <a
href="http://netflix.com">NetFlix</a>, <a
href="http://www.pixar.com/">Pixar</a>, etc.<sup><a
href="http://werxltd.com/wp/2009/08/26/getting-starte-with-hadoop-and-mapreduce/#footnote_0_175" id="identifier_0_175" class="footnote-link footnote-identifier-link" title="This article is a continuation of a recent article I wrote on the different approaches to cloud computing taken by Google and Microsoft">1</a></sup></p><p>Since each of these technologies is worthy of a book of its own, and since even those familiar with the common implementation languages of these technologies (like Java and Python) may need help getting started, I decided to put together all the resources I&#8217;ve found on these technologies in hopes that they will help someone else get started in this fascinating world of distributed or &#8220;cloud computing&#8221;.</p><h3>Introduction to cloud computing</h3><p>One might wonder why they should take the time to learn these technologies and concepts. It is a fair question, considering the amount of time and energy that may be required to put any of this knowledge to functional use. With that in mind, I found the following videos particularly helpful in answering the question &#8220;why should I care?&#8221;:</p><ul><li>Hadoop, Map Reduce, and Big Data Sets <a
href="http://www.youtube.com/watch?v=CMt-IqQlnQ8">Part 1</a>, <a
href="http://www.youtube.com/watch?v=YtkaDQOuJ4k&amp;feature=related">Part 2</a></li><li><a
href="http://www.youtube.com/watch?v=Aq0x2z69syM">O&#8217;Reilly Webcast: An Introduction to Hadoop</a></li><li><a
href="http://www.youtube.com/watch?v=MXoMWC6xPUw">Computing in the Cloud &#8211; Introduction</a></li></ul><h3>Hadoop</h3><p><a
href="http://hadoop.apache.org/">Hadoop</a><sup><a
href="http://werxltd.com/wp/2009/08/26/getting-starte-with-hadoop-and-mapreduce/#footnote_1_175" id="identifier_1_175" class="footnote-link footnote-identifier-link" title="Hadoop was actually inspired by Google, more history and background&nbsp;here.">2</a></sup> is essentially a compilation of a number of different projects  that make distributed computing a lot less painful. The best source of beginner&#8217;s information on Hadoop I&#8217;ve found has come from these Google lectures as well as from <a
href="http://www.cloudera.com">Cloudera</a>&#8217;s <a
href="http://www.cloudera.com/hadoop-training">training pages</a>:</p><ul><li><a
href="http://www.cloudera.com/hadoop-training-programming-with-hadoop">Programming with Hadoop</a></li></ul><h3>MapReduce</h3><p>MapReduce is more of a paradigm than a language. It is a way to write algorithms that can be run in parallel in order to utilize the computing power of a number of computers across a large data set. There are a number of software frameworks that make writing MapReduce jobs a lot easier and in the following videos you will learn how to use some of the most common.</p><ul><li>Cluster Computing and MapReduce <a
href="http://www.youtube.com/watch?v=yjPBkvYh-ss">Lecture 1</a>, <a
href="http://www.youtube.com/watch?v=-vD6PUdf3Js&amp;feature=channel">Lecture 2</a>, <a
href="http://www.youtube.com/watch?v=5Eib_H_zCEY&amp;feature=related">Lecture 3</a>, <a
href="http://www.youtube.com/watch?v=1ZDybXl212Q&amp;feature=related">Lecture 4</a>, <a
href="http://www.youtube.com/watch?v=BT-piFBP4fE&amp;feature=related">Lecture 5</a></li><li><a
href="http://www.cloudera.com/hadoop-training-mapreduce-hdfs">MapReduce and HDFS</a></li><li><a
href="http://www.cs.brandeis.edu/~cs147a/lab/hadoop-intro/">Introduction to Hadoop</a> at Brandeis University</li></ul><h3>Quickstart packages</h3><p>As with many complex technologies, just setting up a working environment can be a challenge in itself, one that is enough to discourage the casual learner. To take some of the stress out of setup and to help you gain some useful hands-on experience with Hadoop and the related cloud technologies, here are a few resources to help you get a working Hadoop environment going fairly quickly.</p><ul><li><a
href="http://www.youtube.com/watch?v=Y3eL6DfNkTw">Introduction to Cloudera&#8217;s distribution of Hadoop</a></li><li><a
href="http://www.cloudera.com/hadoop-training-virtual-machine">Cloudera&#8217;s VMware training image</a>, perfect for quick access to hands-on examples preconfigured in <a
href="http://www.eclipse.org/">Eclipse</a> projects. Requires the free <a
href="http://www.vmware.com/products/player/">VMware Player</a>, which works great on Linux and Windows.</li><li><a
href="http://www.opensolaris.org/os/project/livehadoop/;jsessionid=76A7D3B6D9C487EB1D7075F8EE938FDE">OpenSolaris Hadoop LiveCD</a>, works great in <a
href="http://www.virtualbox.org/">VirtualBox</a>, and can also be installed to disk for a more permanent and dedicated development environment</li></ul><p>Helpful hint regarding videos: if you are like me and prefer to watch/listen to long lectures in your car or otherwise on the go on your netbook, iPod or other mobile device, try looking for the above-mentioned videos on Google Video instead of YouTube. Google Video includes a helpful download link that allows you to take a copy of the movie with you.</p><ol
class="footnotes"><li
id="footnote_0_175" class="footnote">This article is a continuation of a recent article I wrote on the <a
href="http://werxltd.com/wp/2009/06/29/cloud-computing-101/">different approaches to cloud computing taken by Google and Microsoft</a></li><li
id="footnote_1_175" class="footnote">Hadoop was actually inspired by Google, more history and background <a
href="http://en.wikipedia.org/wiki/Hadoop">here</a>.</li></ol>]]></content:encoded> <wfw:commentRss>http://werxltd.com/wp/2009/08/26/getting-starte-with-hadoop-and-mapreduce/feed/</wfw:commentRss> <slash:comments>3</slash:comments> </item> </channel> </rss>
