Wednesday, May 8, 2013

Configuring Data Server


At Kirpeep, we were a recipient of the Geekdom Fund, which gave us $25,000 as a convertible debt note and got us into the Rackspace Start-Up Program.  With $2,000/mo in Rackspace services, we’ve opted for a Managed Service Account.  I’m hoping to put this to good use by having Rackspace handle as much of the configuration for my data server as possible.  Rackspace is pretty good about their support, and I’m confident that most of the config work on my server will be easy.  I’ll share the versions and specs of the software installed on the server once it’s complete.  In the meantime, here’s the diagram I’m basing my analytics engine on (thanks, Foursquare!).




Thursday, May 2, 2013

Do Anything To Analyze (DATA) Intro


My name is Steven Quintanilla and I am the CEO of Kirpeep.com.  Kirpeep is an exchange platform for people to buy, sell, and trade goods and services.  Our goal is to prove the hypothesis that money is not always the most efficient way to value things in real time, and that you can quantifiably identify what an individual or community values in real time by analyzing the parameters surrounding their exchanges.

To do this, we have created Kirpeep.com, a social economic system based on what things are worth, not what they cost.  By doing so, we align ourselves with one of the underlying thoughts behind Kirpeep: the Transitive Property of Equality.  Namely, if a=b and b=c, then a=c.  For us, this translates into the following question: if we provide our skills and services in exchange for money (i.e., a job) and we use that money to buy other goods and services (e.g., concert tickets and web design), shouldn't we be able to trade our skills and services directly for the things we want and need?

I understand that this sounds a little whimsical, but I personally value two things above all else: realness and loyalty.  I haven’t tried to quantify loyalty – yet – but I do have an idea on how realness can be quantified based on how we exchange with one another. Simply put, value is subjective and changes from person-to-person, time-to-time, and exchange-to-exchange. 

Since I went to MIT, it’s probably pretty obvious that I love numbers (I like letters too, I just prefer them in different arrangements than most people).  W00t.  From a realist’s perspective, men lie, women lie, numbers don’t.  For that reason, I’ll be building an analytics engine to analyze the data we call our “Value Graph”.  This data will include demographic information on our users, as well as a plethora of other information regarding the exchanges they made, and even the ones they almost made or could have made.

The overall goal of this experiment is to identify and connect people with the things they want most (the things that give them the highest expected utility) when they want and need them, so as to maximize their buying power with the things they have and help us reduce the world’s dependency on currency.

Planning:
Since this is going to be a massive endeavor, I feel a little planning is in order.  I won’t be dealing with the application side of Kirpeep; this will be focused entirely on the analytics (Recommendation/Personalization) engine.  The plan may not be ideal, but I’ll be documenting the process as I move through it.  I've broken down the necessary actions for this project as follows:
  1. Collect the Data
  2. Organize the Data
  3. Analyze the Data
  4. Report findings on Data

Data Server
  • Hosted at Rackspace Cloud Hosting and built in Ruby
  • MongoDB – Document-oriented data store (fast and scalable) that allows for data partitioning (sharding).
  • Resque – Redis-backed Ruby library for creating background jobs, placing those jobs on multiple queues, and processing them later. In this setup, the jobs use Thrift to communicate with Hive.
  • Redis – Fast, networked, in-memory, persistent, journaled, key-value data store. We turn to Redis for performance, so it should be monitored too: with the rpm_contrib gem installed, Redis interactions show up under the Database tab in New Relic RPM.
  • Thrift – Framework for scalable cross-language services development. It combines a software stack with a code generation engine to build services that work efficiently and seamlessly between multiple languages.
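
To make the background-job idea concrete, here is a rough sketch of placing jobs on named queues and processing them later. It is written in Python for brevity (the server itself is Ruby) and it is not Resque's API: the in-memory `queues` dictionary stands in for Redis, and `enqueue`, `work`, `record_exchange`, and the queue names are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical in-memory stand-in for Redis-backed queues; not Resque's API.
queues = defaultdict(deque)

def enqueue(queue_name, func, *args):
    """Store a job (a function plus its arguments) on a named queue for later."""
    queues[queue_name].append((func, args))

def work(queue_names):
    """Drain the given queues in the order listed and return the job results."""
    results = []
    for name in queue_names:
        while queues[name]:
            func, args = queues[name].popleft()
            results.append(func(*args))
    return results

def record_exchange(user, item):
    # Placeholder job body; a real job would write to the data store.
    return f"{user} exchanged {item}"

# Jobs are queued now and only run when a worker picks them up.
enqueue("low", record_exchange, "alice", "web design")
enqueue("high", record_exchange, "bob", "concert tickets")
processed = work(["high", "low"])  # "high" is drained before "low"
```

Listing the queues in priority order is the design choice worth noting: the worker finishes everything on the first queue before touching the next one.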

Analytics
  • Apache Hadoop – Open-source MapReduce framework for parallel data processing (Yum!)
  • Apache Hive – Secondary service that allows you to interact with Hadoop by defining ‘virtual’ tables and using familiar SQL syntax (not totally sure on this one yet, but I’m going off of Foursquare’s recommendation, no pun intended).
  • R – Free software environment for statistical computing and graphics
  • Valgorithm – Kirpeep’s ‘secret sauce’
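
For anyone new to MapReduce, the pattern Hadoop parallelizes can be sketched on a single machine. The example below is plain Python with no Hadoop involved, and the exchange records, field names, and function names are made up for illustration. It counts exchanges per category in three phases: map, shuffle, and reduce.

```python
from collections import defaultdict

# Made-up exchange records, purely for illustration.
exchanges = [
    {"user": "alice", "category": "web design"},
    {"user": "bob", "category": "concert tickets"},
    {"user": "carol", "category": "web design"},
]

def map_phase(records):
    """Emit a (key, 1) pair for every exchange record."""
    return [(r["category"], 1) for r in records]

def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

counts = reduce_phase(shuffle(map_phase(exchanges)))
```

Assuming a hypothetical `exchanges` table were defined in Hive, the same question could be asked as `SELECT category, COUNT(*) FROM exchanges GROUP BY category`, which is the appeal of putting Hive on top of Hadoop.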

Strategy for Suggestions

  1. Personalize for Current User(s)
    • Based on common attributes of items of interest to User
    • Measure success of suggestions
  2. Recommend to New User(s)
    • Based on what Current Users like them have done
    • Measure success of suggestions
  3. Collect Data on New User and Personalize
    • Repeat (1)
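
The strategy above can be sketched in a few lines. This is only an illustration under loud assumptions, not the Valgorithm: it is Python rather than Ruby, the items, attribute tags, and users are invented, and Jaccard set overlap is swapped in as a simple stand-in for whatever similarity measure ends up being used. `personalize`, `recommend_new`, and `hit_rate` are hypothetical names.

```python
# Hypothetical data: attribute tags per item, and the items each user liked.
item_attrs = {
    "guitar lessons": {"music", "service"},
    "concert tickets": {"music"},
    "web design": {"tech", "service"},
    "logo design": {"tech", "service", "art"},
}
likes = {
    "alice": {"guitar lessons"},
    "bob": {"web design", "logo design"},
}

def jaccard(a, b):
    """Overlap between two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def personalize(user):
    """Step 1: rank unseen items by attribute overlap with the user's likes."""
    profile = set().union(*(item_attrs[i] for i in likes[user]))
    unseen = [i for i in item_attrs if i not in likes[user]]
    return sorted(unseen, key=lambda i: jaccard(profile, item_attrs[i]), reverse=True)

def recommend_new(new_user_attrs, user_attrs):
    """Step 2: suggest what the most similar current user liked."""
    nearest = max(user_attrs, key=lambda u: jaccard(new_user_attrs, user_attrs[u]))
    return sorted(likes[nearest])

def hit_rate(suggested, accepted):
    """Measure success: the fraction of suggestions the user acted on."""
    return len(set(suggested) & set(accepted)) / len(suggested) if suggested else 0.0

ranked = personalize("alice")
for_new_user = recommend_new(
    {"student", "musician"},
    {"alice": {"musician", "teacher"}, "bob": {"designer"}},
)
success = hit_rate(ranked[:2], ["concert tickets"])
```

Once a new user accepts or ignores a few suggestions, they have their own entry in `likes`, and step 3 is just calling `personalize` for them.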

Current Links of Interest: