Going through TIME magazine’s list of the ‘100 most influential people in the world’ is one of my favorite pastimes. One person featured in the 2013 list caught my eye: Sam Yagan, one of the co-founders of the dating website OkCupid (OKC). The note below his name read: ‘What do you get when you combine big data, the quest for love and complete irrelevance? The hippest spot on the Internet: OKC!’
Though I knew that dating websites use predictive/big data analytics, I had never really thought about how they come up with statistical models to predict the compatibility between two people. So I started looking online for how OKC collects data. You may think that I am a nerd (or have no life in general), but the process of predictive analytics is my first love. I love to think of ways to collect data, pre-process it, come up with the best statistical models and optimize them. Well, I won’t bore you any more with my tech talk, but it was amusing to learn that the love of my life is used to help people find love.
Coming back to OKC, I read that they have an extensive online questionnaire that people fill out, and based on their responses they are matched up with various profiles. To put it in layman’s terms, the questions become the predictor variables (the inputs), each person’s answers are the values of those variables, and the compatibility score is the response variable (the output). This input-output relationship is then captured in a mathematical equation. Basically, when people give answers, the server takes in these answers, converts them into a suitable form, plugs them into an equation and outputs a value. This output value for each person is used to determine the compatibility between two people.
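Here is a toy sketch in Python of what such an equation could look like. To be clear, the questions, the weights and the scoring rule are all made up by me for illustration; I don't know the formula OKC actually uses.

```python
from typing import Dict

# Hypothetical questions and importance weights; none of these come from OKC.
WEIGHTS: Dict[str, float] = {
    "wants_kids": 3.0,
    "smokes": 2.0,
    "likes_horror_movies": 1.0,
}


def compatibility(answers_a: Dict[str, str], answers_b: Dict[str, str]) -> float:
    """Return a score in [0, 1]: the weighted share of questions both people answered alike."""
    # Only questions that both people answered can be compared.
    shared = [q for q in WEIGHTS if q in answers_a and q in answers_b]
    total = sum(WEIGHTS[q] for q in shared)
    if total == 0:
        return 0.0
    agreed = sum(WEIGHTS[q] for q in shared if answers_a[q] == answers_b[q])
    return agreed / total


alice = {"wants_kids": "yes", "smokes": "no", "likes_horror_movies": "no"}
bob = {"wants_kids": "yes", "smokes": "no", "likes_horror_movies": "yes"}

# Alice and Bob agree on the two heavier questions: (3 + 2) / (3 + 2 + 1).
print(round(compatibility(alice, bob), 2))  # 0.83
```

A real system would presumably learn its weights from data rather than hard-code them, but the shape is the same: answers go in, a number comes out.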
I have to admit that I am a data junkie. I get my fix by coming up with innovative ways to process the data and glean the story the data is trying to tell. Trying to learn more about OKC’s data collection methods was my fix for this weekend. So I registered on OKC in order to take its thorough and comprehensive questionnaire, to better understand the data that is being collected and to think about how I would go about designing statistical models for such data. I was also curious to know what kind of guy OKC would pair me up with. As I filled out the questionnaire, I realized I had not been very truthful with my answers. I mentioned in my profile that I was looking for a casual fling (which is a blatant lie, because I don’t have the guts to go through with it). I answered that looks are more important than a man’s smartness. Before you go ahead and judge me: if a casual fling is what I am looking for, I sure can judge the book by its cover. In the middle of answering the 158th question, an odd thought occurred to me: people may twist their answers in order to make themselves look appealing/irresistible. But the algorithms can’t account for this human ambiguity and take the answers as is. This introduces errors/bias into the data and could lead to wrong predictions. That is when I decided to stop answering the questions and delete my profile, because I could not take the predictions the system made for me at face value.
However, this approach to dating reminded me of the middle-aged ‘aunties’ in India who tried to fix me up with the so-called ‘perfect guy’. I am sure their approach to matching two people is not as sophisticated as OKC’s, but the essence remains the same. They collect data about a guy and his family (his qualifications, job, salary, properties owned, how many times he goes to the temple, his family status etc.) by exchanging details (a.k.a. gossip) at social gatherings and compare these details with those of the girls they know. Next they rank the girls based on what they judge to be compatibility and approach the highest-ranked girl on the list. If it doesn’t work out, they try the second highest, and the process continues. So I concluded that OKC is a high-tech, ‘immoral’ (Indian aunties don’t believe in setting up people for casual sex because they consider it immoral, and they would take offense if I compared the dating site to the ‘noble work’ they do) version of the middle-aged Indian ‘marriage broker’ aunties. I suppose these aunties’ approach to setting up people probably had some merit after all.
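For fun, the aunties' procedure (rank by compatibility, approach the top of the list, move down if it doesn't work out) fits in a few lines of Python. The names and scores below are invented by me, and `match_by_ranking` is just my own hypothetical helper, not anything OKC or the aunties would recognize.

```python
from typing import Callable, List, Optional


def match_by_ranking(
    candidates: List[str],
    score: Callable[[str], float],
    accepts: Callable[[str], bool],
) -> Optional[str]:
    """Approach candidates from highest to lowest score; return the first who accepts."""
    ranked = sorted(candidates, key=score, reverse=True)
    for candidate in ranked:
        if accepts(candidate):
            return candidate
    return None  # nobody on the list worked out


# Hypothetical compatibility scores, as judged by the aunties.
scores = {"Asha": 0.9, "Meera": 0.7, "Priya": 0.4}
# Suppose the top-ranked match does not work out.
declined = {"Asha"}

match = match_by_ranking(
    list(scores),
    score=lambda name: scores[name],
    accepts=lambda name: name not in declined,
)
print(match)  # Meera
```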
P.S. Sam Yagan went to Harvard and Stanford and used his passion for mathematics (he was a math major at Harvard) to help people not as lucky as him in finding love (he married his high school sweetheart), made money and got featured on the TIME ‘100 most influential people on the planet’ list. I applaud his efforts, as they have helped bring many people closer.
P.P.S. I am extremely curious to know how my friends will react if I add OKC as my current employer on Facebook. Maybe this could be my project for next weekend: building a statistical model to determine whether my friends’ reactions match up to how I thought they would react.