Do Anything To Analyze(DATA)
I'll be working on building an analytics engine, and here is where I'll chronicle my approach, issues, resources, etc. Keep It Real, People!
Wednesday, May 8, 2013
At Kirpeep, we were a recipient of the Geekdom Fund, which gave us $25,000 as a convertible debt note and got us into the Rackspace Start-Up Program. With $2,000/mo in Rackspace services, we’ve opted for a Managed Service Account. I’m hoping to put this to good use by having Rackspace handle as much of the configuration for my data server as possible.
Rackspace is pretty good about their support, and I’m confident that most of the config work on my server will be easy. I’ll share the versions and specs of the software installed on the server once it’s complete. In the meantime, here’s the diagram I’m basing my analytics engine on (thanks, Foursquare!).
Thursday, May 2, 2013
Do Anything To Analyze(DATA) Intro
My name is Steven Quintanilla and I am the CEO of Kirpeep.com. Kirpeep is an exchange platform for people to buy, sell, and trade goods and services. Our goal is to prove the hypothesis that money is not always the most efficient way to value things in real time, and that you can quantitatively identify what an individual or community values in real time by analyzing the parameters surrounding their exchanges.
To do this, we have created Kirpeep.com, a social economic system based on what things are worth, not what they cost. By doing so, we align ourselves with one of the underlying thoughts behind Kirpeep – the Transitive Property of Equality. Namely, if a=b and b=c, then a=c. For us, this translates into the thought that if we provide our skills and services in exchange for money (i.e., a job) and we use that money to buy other goods and services (e.g., concert tickets and web design), shouldn't we be able to trade our skills and services directly for the things we want and need?
I understand that this sounds a little whimsical, but I personally value two things above all else: realness and loyalty. I haven’t tried to quantify loyalty – yet – but I do have an idea of how realness can be quantified based on how we exchange with one another. Simply put, value is subjective and changes from person to person, time to time, and exchange to exchange.
Since I went to MIT, it’s probably pretty obvious that I love numbers (I like letters too, I just prefer them in different arrangements than most people). W00t. From a realist’s perspective, men lie, women lie, numbers don’t. For that reason, I’ll be building an analytics engine to analyze the data we call our “Value Graph”. This data will include demographic information on our users, as well as a plethora of other information regarding the exchanges they made – and even the ones they almost made or could have made.
The overall goal of this experiment is to identify and connect people with the things they want most (the ones that give them the highest expected utility) when they want and need them, so as to maximize their buying power with the things they have and help us reduce the world’s dependency on currency.
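To make "highest expected utility" a bit more concrete, here is a minimal Python sketch. It assumes, purely for illustration, that each candidate item comes with an estimated probability that the user wants it and a value the user would place on it. The names `Candidate`, `p_want`, and `value` are hypothetical and are not part of Kirpeep or the Valgorithm.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A hypothetical item that could be suggested to a user."""
    name: str
    p_want: float  # assumed estimate of the probability the user wants it (0..1)
    value: float   # assumed value the user places on it, in arbitrary units


def expected_utility(c: Candidate) -> float:
    """Expected utility = probability of wanting the item * its value."""
    return c.p_want * c.value


def rank_by_expected_utility(candidates):
    """Return candidates sorted from highest to lowest expected utility."""
    return sorted(candidates, key=expected_utility, reverse=True)


if __name__ == "__main__":
    items = [
        Candidate("concert tickets", p_want=0.25, value=80.0),
        Candidate("web design", p_want=0.5, value=100.0),
        Candidate("guitar lessons", p_want=0.75, value=40.0),
    ]
    for c in rank_by_expected_utility(items):
        print(c.name, expected_utility(c))
```

In practice both numbers would have to be estimated from the exchange data, which is the hard part; the ranking step itself is just a sort.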
Planning:
Since this is going to be a massive endeavor, I feel a little planning is in order. I won’t be dealing with the application side of Kirpeep; this will be focused entirely on the analytics (Recommendation/Personalization) engine. It may not be ideal, and I’ll be documenting the process as I move through it. I've broken down the necessary actions for this project as follows:
- Collect the Data
- Organize the Data
- Analyze the Data
- Report findings on Data
Data Server
- Hosted at Rackspace Cloud Hosting and built in Ruby
- MongoDB – Document-oriented data store (fast and scalable) that allows for data partitioning (sharding).
- Resque – Redis-backed Ruby library for creating background jobs, placing those jobs on multiple queues, and processing them later. In this setup, the jobs use Thrift to communicate with Hive.
- Redis – Fast, networked, in-memory, persistent, journaled, key-value data store. We turn to Redis for performance, so shouldn't we be monitoring it too? With the rpm_contrib gem installed, Redis interactions show up under the Database tab in New Relic RPM.
- Thrift – Framework for scalable cross-language services development. It combines a software stack with a code generation engine to build services that work efficiently and seamlessly between multiple languages.
Analytics
- Apache Hadoop – Open-source MapReduce framework for parallel data processing (Yum!)
- Apache Hive – Secondary service that allows you to interact with Hadoop by defining ‘virtual’ tables and using familiar SQL-like syntax (not totally sure on this one yet, but I’m going off of Foursquare’s recommendation – no pun intended).
- R – Free software environment for statistical computing and graphics
- Valgorithm – Kirpeep’s ‘secret sauce’
Strategy for Suggestions
1. Personalize for Current User(s)
   - Based on common attributes of items of interest to User
   - Measure success of suggestions
2. Recommend to New User(s)
   - Based on what Current Users like them have done
   - Measure success of suggestions
3. Collect Data on New User and Personalize
   - Repeat (1)
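The strategy above can be sketched roughly as follows. This is a minimal Python illustration, not the Valgorithm: it assumes items are described by attribute tags and users by trait tags, and it swaps in simple attribute counting and Jaccard similarity as stand-ins for whatever scoring the real engine ends up using. Every function name and data shape here is hypothetical.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity of two collections: |a & b| / |a | b| (0.0 if both empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def personalize(liked_items, catalog, top_n=3):
    """Step 1: rank unseen items by attributes shared with items the user likes.

    catalog maps item name -> set of attribute tags (assumed schema).
    """
    # Count how often each attribute appears among the user's items of interest.
    profile = Counter(tag for name in liked_items for tag in catalog[name])
    scores = {
        name: sum(profile[tag] for tag in tags)
        for name, tags in catalog.items()
        if name not in liked_items
    }
    ranked = sorted(scores, key=lambda n: (-scores[n], n))
    return [n for n in ranked if scores[n] > 0][:top_n]


def recommend_for_new_user(new_user_traits, current_users, top_n=3):
    """Step 2: suggest what similar current users have exchanged.

    current_users is a list of (traits, exchanged_items) pairs (assumed schema).
    """
    votes = Counter()
    for traits, exchanged in current_users:
        weight = jaccard(new_user_traits, traits)  # more similar users count more
        for item in exchanged:
            votes[item] += weight
    ranked = sorted(votes, key=lambda n: (-votes[n], n))
    return [n for n in ranked if votes[n] > 0][:top_n]


def hit_rate(suggested, accepted):
    """Measure success: fraction of suggestions the user actually acted on."""
    if not suggested:
        return 0.0
    return len(set(suggested) & set(accepted)) / len(suggested)


if __name__ == "__main__":
    catalog = {
        "guitar lessons": {"music", "lessons"},
        "concert tickets": {"music", "events"},
        "web design": {"tech", "design"},
        "piano lessons": {"music", "lessons"},
    }
    picks = personalize(["guitar lessons"], catalog)
    print(picks, hit_rate(picks, ["piano lessons"]))
```

Step 3 falls out of the first two: once a new user has made a few exchanges, their items feed `personalize` and the loop repeats, with `hit_rate` tracked each round to see whether the suggestions are getting better.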
Current Links of Interest: