Friday, November 30, 2012

More places to learn Hadoop

This is a follow-up to my earlier post on starting to learn Hadoop. I came across a couple of interesting links for learning Hadoop and understanding how it works:

Learning “Machine Learning” by Example is a Meetup that you can join off-site. It is a good starting place for someone who is new to analytics, and a wonderful opportunity to learn the concepts underlying Machine Learning through the “learning by example” principle.

Here is a well-constructed diagram of how Big Data gets stored in and retrieved from a distributed file system like Hadoop's HDFS.
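For readers who prefer code to pictures, here is a minimal sketch of the general idea: a file is split into fixed-size blocks, each block is copied to more than one node, and an index remembers where the blocks live. Everything in it (`ToyCluster`, `put`, `get`, the tiny block size, the round-robin placement) is made up for illustration and is not the Hadoop API; real systems use much larger blocks and smarter placement.

```python
from itertools import cycle

BLOCK_SIZE = 8     # bytes per block; toy value, real blocks are far larger
REPLICATION = 2    # copies kept of each block; toy value


class ToyCluster:
    """A toy stand-in for a distributed file system (hypothetical, not Hadoop)."""

    def __init__(self, node_names):
        # Assumes REPLICATION <= number of nodes, so replicas land on distinct nodes.
        self.nodes = {name: {} for name in node_names}  # node -> {block_id: bytes}
        self.index = {}                                 # filename -> [(block_id, [nodes])]
        self._next_node = cycle(node_names)             # simple round-robin placement

    def put(self, filename, data):
        """Split data into fixed-size blocks and store each on REPLICATION nodes."""
        placements = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block_id = f"{filename}#{offset // BLOCK_SIZE}"
            holders = [next(self._next_node) for _ in range(REPLICATION)]
            for node in holders:
                self.nodes[node][block_id] = data[offset:offset + BLOCK_SIZE]
            placements.append((block_id, holders))
        self.index[filename] = placements

    def get(self, filename, down=()):
        """Reassemble a file, reading each block from any holder not in `down`."""
        parts = []
        for block_id, holders in self.index[filename]:
            alive = [n for n in holders if n not in down]
            if not alive:
                raise IOError(f"all replicas of {block_id} are unavailable")
            parts.append(self.nodes[alive[0]][block_id])
        return b"".join(parts)


cluster = ToyCluster(["node1", "node2", "node3"])
data = b"hello distributed file systems"
cluster.put("notes.txt", data)

# The file still reads back in full when one node is unavailable.
restored = cluster.get("notes.txt", down={"node2"})
```

Because every block has a second copy on a different node, losing a single node does not lose the file; that is the property the replication step is there to provide.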


I just came across these tutorials and think they are awesome:
Yahoo's Hadoop Tutorial

Tuesday, November 27, 2012

Sources for learning more on Algorithms and their applications

Algorithms are an enticing concept. When you are in grad school for Software Engineering, the hows, whys and patterns are both mind-boggling and intriguing. The history of algorithms, their application before the advent of computing, and their use in the present are all very interesting.

Here is some material that has been of great help:

Introduction to Algorithms by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein

Algorithms in Modern Mathematics and Computer Science by Donald E. Knuth

Algorithms by Sanjoy Dasgupta, Christos Papadimitriou and Umesh Vazirani

I am always looking for more, so please feel free to comment.

Thursday, November 22, 2012

Using MOOCs while in Grad School

Graduate school certainly piques your curiosity. You are sometimes so intrigued that you want to learn more, but the topic is not part of your coursework. What do you do? Of course, you could look it up on the web or go to a library, but you want more. You actually want to learn in a more interactive way. My answer is to take advantage of Massive Open Online Courses (MOOCs), including Udacity, Coursera and edX. You get quality courses from highly qualified professors at some of the best universities in the world. I use them to supplement my coursework. You can even get certificates for some of the courses you complete.

For a graduate student the questions are:

·         how do I fit this in with all the other things I have to do while in grad school? My answer is to do just one course per semester, or just do the portions of a course that you want to. If you bite off more than you can chew, you will give up.

·         how do I make a choice among the many courses when they all seem so interesting? If you are going to use the course to supplement your coursework, read the syllabus and choose the ones that will augment your schoolwork. If you are doing a course that your school does not offer, but you want to do anyway, make sure you are willing to commit three to four hours a week to the course, especially if you are learning something new.

·         how do I keep myself motivated to keep going while I have so many other grad school commitments? Know before you start that you will face this question as you reach the midway point of your grad school semester. Be ready to dedicate the time before you commit. One good time to start a MOOC is during a semester break; this way you are likely to finish before the semester mid-terms.

·         how do I know I am learning what I set out to learn? Start with a checklist, especially if your aim is to learn a specific concept or technology. Check yourself by doing a small project once you have finished. I know that is easier said than done. Do it during your semester break.

·         how do I translate my coursework onto my resume? If you have completed the course, you get a certificate of completion. If you have done only specific parts of the course, a project you did using your newly acquired knowledge is the best bet. 

I know what I have said here is in no way comprehensive, so feel free to add your comments and experiences. So far I have partially completed three courses: Stats 101 and CS 101 from Udacity, to brush up on my statistics skills and the basics of CS, and Data Analysis from Coursera, to learn R in a structured manner.

Wednesday, November 7, 2012

Be Assertive in grad school

In grad school, or for that matter anywhere, some of us dismiss our findings as trivial and watch others use them to their advantage. I had such an experience just yesterday. I located an error in the code handed to us by our professor. I ran a few tests and knew he was way off the mark. In my naivety I went ahead and shared this information with a fellow grad student. The smart cookie approached the professor with the error, got applauded for it and may have scored extra credit points as well.

Lessons learned:
1. If you have tested an idea, share it with the world in such a way that you are the owner of your idea. You should benefit from the hours and hard work that went into the process.
2. Do not dismiss your findings as trivial. You may risk looking like a fool if you are wrong, but the taste of success if you are right is worth the risk.
3. If you do make this error, just remember you were smart once, and you are more likely than the person stealing your idea to be smart once again!

Sunday, November 4, 2012

Data Scientists work to solve problems for the thrill

Is money the only motivator? Sometimes you just want to be challenged more than you want to be remunerated. You just want to ace it. Here is an example of a data scientist showing this behavior in action.