Stuff I Built in no particular order

Embedded Checkout Bread Finance

Merchants hate it when a service provider takes customers away from their page to perform a transaction. At Bread, we provided a straightforward API to integrate any storefront experience with Bread financing. The button could look like our default button, but was completely customizable for those who preferred a white-labeled solution.
screenshot

Go ReactJS PostgreSQL

Disaster Alerts in Google Search / Android Google

Cellular providers have been required by law to (finally) start pushing out timely public safety alerts to phones. These alerts are usually in all caps and accompanied by an obscenely scary noise. Using the data that the Public Alerts team was already ingesting, we wanted to supplant (or at least supplement) these alerts with something less panic-inducing and more information-forward. Our mission was to replace "FLOOD WARNING" with a map and detailed (decapitalized) information such as time, duration, depth, likelihood... all the stuff that matters when you need it most.

My particular contribution to this project was to take the alert text from sources like the National Weather Service, extract the most relevant parts, and recombine them to form a useful and informative snippet. For example, on a tornado warning I'd focus more on directional and temporal aspects than on rainfall amounts. This was done by tokenizing the text and weighting certain n-grams based on the alert type. I also worked on the UI for both the web and Android.
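A minimal sketch of that weighting idea, not the production code: each sentence is tokenized, every unigram and bigram is looked up in a per-alert-type weight table, and the highest-scoring sentence wins. The weight table, the function names, and the sample sentences are all made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// ngramWeights maps an alert type to hypothetical n-gram weights.
// The values are illustrative only.
var ngramWeights = map[string]map[string]float64{
	"tornado": {"moving": 2, "mph": 2, "until": 3, "moving east": 4},
	"flood":   {"inches": 3, "rainfall": 3, "until": 2, "river": 2},
}

// tokenize lowercases a sentence and splits it into words, trimming
// surrounding punctuation.
func tokenize(s string) []string {
	fields := strings.Fields(strings.ToLower(s))
	out := make([]string, 0, len(fields))
	for _, f := range fields {
		if w := strings.Trim(f, ".,;:!?\"'()"); w != "" {
			out = append(out, w)
		}
	}
	return out
}

// score sums the weights of every unigram and bigram in the sentence
// for the given alert type. Unknown n-grams contribute zero.
func score(sentence, alertType string) float64 {
	weights := ngramWeights[alertType]
	toks := tokenize(sentence)
	total := 0.0
	for i, t := range toks {
		total += weights[t]
		if i+1 < len(toks) {
			total += weights[t+" "+toks[i+1]]
		}
	}
	return total
}

// bestSentence returns the highest-scoring sentence for the alert type.
func bestSentence(sentences []string, alertType string) string {
	best, bestScore := "", -1.0
	for _, s := range sentences {
		if sc := score(s, alertType); sc > bestScore {
			best, bestScore = s, sc
		}
	}
	return best
}

func main() {
	alert := []string{
		"Rainfall of 2 inches is possible.",
		"The storm is moving east at 30 mph until 6 PM.",
	}
	// The same text yields a different snippet depending on alert type.
	fmt.Println("tornado:", bestSentence(alert, "tornado"))
	fmt.Println("flood:  ", bestSentence(alert, "flood"))
}
```

Keeping the weights in a table keyed by alert type means the same extraction code can favor direction and timing for a tornado and rainfall for a flood.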

These alerts were originally intended only for Maps on the Android platform, but once demo'd we were asked to put them in Google Web Search, iOS Maps, and Google Now.

screenshot

Go Python C++ Javascript

Anti-Fraud Engine Bread Finance

Not only do fraudulent transactions cost the company money, but in many cases a failure to properly identify the individual to whom you are writing a loan can cause some serious legal stickiness. This fraud engine collected all sorts of data throughout the checkout process, like IP address, device ID, geolocation, and velocity. In addition to that, we also ran verification checks on the addresses provided and the mobile number, and checked a ton of fraud databases.

All of these discrete pieces of data were collected throughout the process, and just before you were able to complete a checkout the fraud engine would generate signals out of all the data and run them against a custom-built rule set. The decisioning was completely concurrent and as fast as its slowest rule. Given that the most complicated rule could really only be a string comparison, we were able to evaluate over a thousand rules on a huge amount of data in under 200ms. It was pretty sweet.
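A rough sketch of what "as fast as its slowest rule" can look like in Go, assuming one goroutine per rule. The Signals and Rule types, the rule names, and the signal keys are hypothetical and are not the real engine's.

```go
package main

import (
	"fmt"
	"sync"
)

// Signals is a hypothetical bag of values gathered during checkout.
type Signals map[string]string

// Rule is a hypothetical rule: a name plus a predicate that reports
// whether the signals look suspicious.
type Rule struct {
	Name  string
	Match func(Signals) bool
}

// Evaluate runs every rule in its own goroutine and returns the names of
// the rules that matched, in rule order. Total latency tracks the slowest
// rule rather than the sum of all rules.
func Evaluate(rules []Rule, s Signals) []string {
	hits := make([]bool, len(rules))
	var wg sync.WaitGroup
	for i, r := range rules {
		wg.Add(1)
		go func(i int, r Rule) {
			defer wg.Done()
			hits[i] = r.Match(s) // each goroutine writes only its own slot
		}(i, r)
	}
	wg.Wait()

	var matched []string
	for i, hit := range hits {
		if hit {
			matched = append(matched, rules[i].Name)
		}
	}
	return matched
}

func main() {
	rules := []Rule{
		{"country-mismatch", func(s Signals) bool { return s["ip_country"] != s["billing_country"] }},
		{"blocked-device", func(s Signals) bool { return s["device_id"] == "dev-123" }},
	}
	s := Signals{"ip_country": "US", "billing_country": "CA", "device_id": "dev-999"}
	fmt.Println(Evaluate(rules, s))
}
```

The signals are only read during evaluation, so the rules can share them without locking.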

Go

Cross Portal Search Admeld

Between Users, Ad Network Accounts, Creatives, Orders and a whole mess of other things... finding a particular piece of information quickly was a trial for internal Admeld users. Cross Portal Search (affectionately known as 'Corporal Smurf' after a post-dentist talk with the CTO) was a data-type agnostic search engine to find whatever you were looking for.

This search engine indexed all the important metadata about every piece of data in our system to allow quick recall from virtually anything you wanted to search on. This was built in Java using Lucene, although if Elasticsearch had been around at the time it would have made my job a lot easier. This is a nice segue into the next item...

Java Lucene MySQL

Inventory Search isocket

isocket's core mission is to make it easier for Publishers and Advertisers to find each other and do business directly. Central to that goal is the ability to find inventory available on a particular site by virtually any number of vectors, such as comScore demographics, ad placement, price, IAB category... even weather - although no one has ever used that.
screenshot

Node.js Javascript Elasticsearch Java Thrift

Content Categorizer / Search Monitor110

As User Generated Content came onto the web, Monitor110 ingested content into its multi-stage processing service. A homegrown Bayesian categorization engine tagged articles with relevant stock tickers or categories. Daily re-training of the engine actually refined the queries such that after a period of time we would see articles popping up in a topic that we had not put in a single training term for.

The best example of this was during the time of the first iPhone. Rumors were circulating that Apple was releasing a phone and the internet chatter had dubbed it the 'Jesus Phone'. Within a few days, articles mentioning 'Jesus Phone' and no other Apple-esque terms started appearing in the AAPL ticker. We all ran away terrified as the now-sentient system threatened to blog mean things about us.
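The drift described above falls out naturally from re-training a Bayesian classifier on its own output. Here is a small sketch using a multinomial naive Bayes with add-one smoothing. This is an assumption about the general technique, not a reconstruction of the homegrown engine, and the training strings are invented.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// Classifier is a tiny multinomial naive Bayes text classifier.
type Classifier struct {
	wordCounts map[string]map[string]int // category -> word -> count
	totalWords map[string]int            // category -> total words seen
	docCounts  map[string]int            // category -> documents seen
	totalDocs  int
	vocab      map[string]bool
}

func NewClassifier() *Classifier {
	return &Classifier{
		wordCounts: map[string]map[string]int{},
		totalWords: map[string]int{},
		docCounts:  map[string]int{},
		vocab:      map[string]bool{},
	}
}

// Train adds one document to the given category.
func (c *Classifier) Train(category, text string) {
	if c.wordCounts[category] == nil {
		c.wordCounts[category] = map[string]int{}
	}
	c.docCounts[category]++
	c.totalDocs++
	for _, w := range strings.Fields(strings.ToLower(text)) {
		c.wordCounts[category][w]++
		c.totalWords[category]++
		c.vocab[w] = true
	}
}

// Classify returns the category with the highest log-probability,
// using add-one (Laplace) smoothing for unseen words.
func (c *Classifier) Classify(text string) string {
	words := strings.Fields(strings.ToLower(text))
	best, bestScore := "", math.Inf(-1)
	for cat, docs := range c.docCounts {
		score := math.Log(float64(docs) / float64(c.totalDocs))
		for _, w := range words {
			num := float64(c.wordCounts[cat][w] + 1)
			den := float64(c.totalWords[cat] + len(c.vocab))
			score += math.Log(num / den)
		}
		if score > bestScore {
			best, bestScore = cat, score
		}
	}
	return best
}

func main() {
	c := NewClassifier()
	c.Train("AAPL", "apple iphone launch")
	c.Train("MSFT", "microsoft windows release")

	// An article that still mentions a known term gets tagged...
	article := "apple jesus phone rumor"
	tag := c.Classify(article)
	// ...and the daily re-training feeds it back in under that tag.
	c.Train(tag, article)

	// Now the new phrase alone is enough, with no Apple term present.
	fmt.Println(c.Classify("jesus phone"))
}
```

Nobody ever adds "jesus phone" as a training term here. The classifier picks it up from an article it tagged itself.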

Java Lucene ActiveMQ MySQL

RTB Analytics Admeld

Insight into how a Publisher was making money off Admeld's RTB platform was essential to selling the concept of RTB, as well as providing any shred of market intelligence to a typically black-boxed world.

The RTB Analytics platform leveraged our already existing (and ever growing) Hadoop-based data aggregation system. By slicing different types of data into pivotable chunks, we were able to present clear and dynamic visualizations of how a Publisher was making money. (Screenshot hopefully coming).

Ruby Sinatra jQuery Hadoop Vertica Javascript

Web Automation Admeld

Web Automation grew out of two essential needs:

We built a system that would log in to various ad network reporting systems (very few had APIs) and scrape out relevant impression and payment data. This data would be used alongside our own reporting information to uncover discrepancies. As an example:

We would request to serve an Ad from both Ad Network A and Ad Network B. Ad Network A would tell us they paid $1 per impression, while Ad Network B would report they paid only $0.90. We would serve A. After scraping numbers from their reporting system and comparing them against our own numbers, we would see that Ad Network A would underreport the impressions, either for nefarious reasons or just lack of technical sophistication. We would know that we served 100, but they would report 80. This meant that we were getting paid only $80 for 100 impressions, while serving from Ad Network B (who reported more accurately) would have netted us $90. The Ad Server would then work off our own internally derived CPM and favor the true higher price. Ad Network names have been removed to protect the innocent.
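The arithmetic in that example can be written down as a short sketch. The Network type and its field names are made up for illustration. The numbers are the ones from the example, with Ad Network B assumed to report all 100 impressions.

```go
package main

import "fmt"

// Network holds the figures from the example: the quoted price per
// impression, how many impressions we served, and how many the network
// reported (and therefore paid for).
type Network struct {
	Name        string
	QuotedPrice float64
	Served      int
	Reported    int
}

// EffectivePrice is the revenue actually received per served impression:
// quoted price scaled by the fraction of impressions the network reported.
func (n Network) EffectivePrice() float64 {
	if n.Served == 0 {
		return 0
	}
	return n.QuotedPrice * float64(n.Reported) / float64(n.Served)
}

// Best returns the network with the highest effective price, or the zero
// Network if the slice is empty.
func Best(networks []Network) Network {
	var best Network
	for i, n := range networks {
		if i == 0 || n.EffectivePrice() > best.EffectivePrice() {
			best = n
		}
	}
	return best
}

func main() {
	networks := []Network{
		{Name: "A", QuotedPrice: 1.00, Served: 100, Reported: 80},
		{Name: "B", QuotedPrice: 0.90, Served: 100, Reported: 100},
	}
	for _, n := range networks {
		fmt.Printf("%s: quoted $%.2f, effective $%.2f\n", n.Name, n.QuotedPrice, n.EffectivePrice())
	}
	fmt.Println("serve:", Best(networks).Name)
}
```

On quoted price A wins, but on effective price ($0.80 vs $0.90) B does, which is the decision the ad server ended up making.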

The system was built in a time before really cool stuff like Agouti existed, so we rolled our own solution. A service would fire up individually addressable instances of Firefox, which had ports open to allow us to inject our own Javascript into the page. Every page the browser visited would then contain our own admeld.js, which had methods we could call from the main Java scraping service (e.g. pullTableColumns would exist as a Java method which would then call the corresponding Javascript function through the JSSH port).

With this framework now in place, I built a runner that would use Groovy scripts that were custom built for each reporting platform. User credentials were injected in, and the service would return simple JSON results from the scrape. We were essentially API-ifying a site and normalizing their data to fit into our system in a clear and IAB standards-compliant way. It was easily one of our biggest wins, and often cited as a key reason Google acquired Admeld.

Firefox Java Groovy MySQL Javascript

Bezerk and Smoosh Admeld

Why "map" when you can Bezerk? Why "reduce" when you can Smoosh? In order for Admeld's ad servers to make intelligent decisions, we needed to know what they were doing.

Ad Server clusters in multiple co-locations dumped billions of lines of log data. More often than not, a single request would span multiple servers. Small (local) Hadoop clusters would grab 15 minute increments of log data from all the servers, and then combine these lines together into a coherent RequestChain object which would represent only the relevant (non-duplicated) data points about that particular request. These newly smooshed objects would then stream into the central Hadoop cluster for some serious crunching. At any given time, we were no more than 15 minutes off real-time with our reporting. This was huge, as most Publishers were accustomed to only getting this data every few days, or sometimes only once a month. Combined with the Web Automation system, we were able to have a complete view of what was going on in what was quickly becoming a very, very large ecosystem.
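To make the "smooshing" step concrete, here is a single-process sketch of grouping log lines by request and dropping duplicated data points. The real thing ran as Hadoop jobs in Java. The LogLine shape, the RequestChain fields, and the first-value-wins rule below are all assumptions for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// LogLine is a hypothetical parsed log record from one ad server.
type LogLine struct {
	RequestID string
	Server    string
	Fields    map[string]string
}

// RequestChain is a hypothetical merged view of a single request.
type RequestChain struct {
	RequestID string
	Servers   []string          // distinct servers that handled the request, sorted
	Fields    map[string]string // de-duplicated data points
}

// Smoosh groups lines by request ID and merges their fields. When two
// lines carry the same key, the first value seen wins.
func Smoosh(lines []LogLine) map[string]*RequestChain {
	chains := make(map[string]*RequestChain)
	seen := make(map[string]map[string]bool) // request ID -> servers already recorded
	for _, l := range lines {
		c, ok := chains[l.RequestID]
		if !ok {
			c = &RequestChain{RequestID: l.RequestID, Fields: map[string]string{}}
			chains[l.RequestID] = c
			seen[l.RequestID] = map[string]bool{}
		}
		if !seen[l.RequestID][l.Server] {
			seen[l.RequestID][l.Server] = true
			c.Servers = append(c.Servers, l.Server)
		}
		for k, v := range l.Fields {
			if _, dup := c.Fields[k]; !dup {
				c.Fields[k] = v
			}
		}
	}
	for _, c := range chains {
		sort.Strings(c.Servers)
	}
	return chains
}

func main() {
	lines := []LogLine{
		{RequestID: "req-1", Server: "nyc-02", Fields: map[string]string{"publisher": "p9", "size": "300x250"}},
		{RequestID: "req-1", Server: "nyc-01", Fields: map[string]string{"publisher": "p9", "winner": "network-b"}},
		{RequestID: "req-2", Server: "sfo-01", Fields: map[string]string{"publisher": "p3"}},
	}
	chains := Smoosh(lines)
	fmt.Println(chains["req-1"].Servers, chains["req-1"].Fields)
}
```

In a map/reduce setting the request ID would be the shuffle key, and the merge inside the loop is what the reducer would do for each key.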

Hadoop Java MySQL

Power Outage/Gas Supply Maps During Hurricane Sandy Google

During Hurricane Sandy, the NYC branch of the Google Crisis Response team had a rare opportunity to operate in an area where we were actively responding as well. Since we were able to work closely with the NYC Department of Emergency Management and the NYC Mayor's Office, we were able to get access to power outage data we normally were never allowed to use and plot it on our Crisis Map.

There was a more interesting data point we were able to capture as well.  There were a lot of software engineers who no longer had to go to work for the week and were forming up under IRC and Twitter as "Hurricane Hackers" to try to help out in any way they could.  I was able to engage that community and Google provided resources to help them, and they in turn were able to hack together a gas supply data set that I was then able to plot on the Crisis Map.  While probably not in any way the most difficult thing I've worked on, it's the only thing I've ever built that I saw on CNN while I was coding it.

screenshot
article

Go Node.js Javascript Appengine

Magicbus isocket

Evented io / pubsub backend. TODO: FILL ME IN

C++ Thrift Java Node.js Javascript RabbitMQ Redis Go

Firemeld / Firemeld for Chrome Admeld


extension
product page

Firefox Chrome Javascript jQuery

Dogfort Side Project

Anti-social networking. As my circle of eng friends from Admeld got thrown to the diaspora, we could no longer all communicate via a shared Skype session or on corporate GTalk. Also, most chat platforms were pretty weak on showing images and video. We tried our own private subreddit and a number of other things, and finally Dogfort was born. It had a few key features:

Built under cover of darkness, and usually while a little drunk, Dogfort has been through many iterations. Direct contact with Dogfort users provides a clean and efficient feedback loop for incorporating new features. Today's version has a voting system, comments, and a Chrome extension to allow for quick posting. Built using AngularJS on the frontend and Node.js on the back, Dogfort also gives me an awesome playground to try out new technologies before advocating for them in my paying gig.
screenshot

AngularJS Javascript Node.js MongoDB Heroku Express Firebase

This Resume Of Sorts Side Project

Not sure how it's going yet.

HTML Bootstrap

Where I Built It in a rather particular order

Bread Finance  Principal Engineer / Architect / Quartermaster 

New York NY Aug 2014 -> Present

Bread makes it easy to pay over time for the things you love. link

iSocket (acquired by Rubicon Project) Senior Software Engineer 

Burlingame CA, New York NY Mar 2013 -> Aug 2014

iSocket simplifies the direct sales process. Their technology automates the manual steps like order execution and monitoring, helping premium publishers and advertisers do more business directly.
link

Google  Software Engineer V 

Mountain View CA, New York NY Dec 2011 -> Mar 2013

Google.org Crisis Response Team
When disaster strikes, people turn to the internet for information. We help ensure the right information is there in these times of need by building tools to collect and share emergency information, and by supporting first responders in using technology to help improve and save lives.
link
screenshot

Google Public Alerts
Being prepared is just as important as knowing what to do in a crisis. Public Alerts provide a warning before disasters cause damage, so searching for "Hurricane info" in Miami when there is an alert for the East coast would trigger a Public Alert and give you time to prepare.
link
screenshot

Admeld (Acquired by Google) Senior Software Engineer 

New York NY Aug 2008 -> Dec 2011

Since 2008, Admeld has led the industry in helping premium publishers maximize their ad revenue and simplify their operations. Admeld pioneered the private ad exchange and built technology that made it easy for publishers to identify new opportunities and control how every impression is sold.
link

Monitor110  Lead Engineer 

New York NY Mar 2005 -> Jul 2008

Monitor110 gathered information from over 130 million sources of various types, ranked by financial market knowledge through a proprietary algorithm that took 50 factors into account. Users could choose between top sources preselected for their market sector and subscribe to sources of their own. Static sites could be monitored for changes with good granularity. Premium subscription and other deep web sources, blogs, forums, news and regulatory filings were among the sources included. The end results were delivered through a proprietary RSS reader with email, IM and SMS alerts as appropriate.
article
post-mortem
screenshot

Nat'l Reading Styles Institute  Consultant, Software Engineer 

New York NY Oct 2001 -> Mar 2005

NRSI is a research-based educational organization dedicated to improving literacy.
link (I swear I didn't build this terrible site)

SUSS MicroTec  Senior Software Engineer / Electrical Engineer 

Ste Jeorie FRANCE, Vonnigen GERMANY, Waterbury VT Aug 2000 -> Oct 2001

SUSS MicroTec is a leading supplier of process equipment for microstructuring in the semiconductor industry and related markets. Their portfolio covers a comprehensive range of products and solutions for backend lithography, wafer bonding and photomask processing, complemented by micro-optical components.
link

International Business Machines  Software Engineer 

Binghamton NY, Essex Junction VT Sep 1999 -> Oct 2001

IBM Microelectronics delivers application-optimized semiconductor technologies designed to take performance, integration and power efficiency to the next level in solutions spanning mobile and wired, from consumer products to robots that do really well on Jeopardy.
link