The best analogies for technical things most often come from completely non-technical sources.  This post on the Four Hour Work Week blog really resonated with me, especially given some of my current projects.

Push versus pull describes the fundamental choice that must often be made when integrating two (or more) connected systems.  I have struggled with this many times and have found myself overcomplicating things by coming up with elegant strategies to push data from system A to B and C and so on.  Those strategies leveraged appropriate technologies, but a simpler answer was staring at me the whole time.

Early on I had developed two interfaces which were pull based.  You want this data?  Come get it, and I’ll give you the data you asked for and nothing more.  Simple and effective, though it put the burden on consumers to come get it.

Neither of those interfaces has failed to date (though we have had problems with how they were integrated in a few systems).

In the course of some new projects I was tempted into overcomplicating them because I had modeled the whole scenario as pushing a message through our systems.  Thanks to some insightful tutelage, Clint Edmonson pointed me right back to my pull model and reminded me of a quote I am very fond of.

"Doing old things in new ways"

A fundamental principle Clint teaches is that if systems need data from each other, they should ask for it.  This is somewhat of a reversal of the legacy approach, where a system would often send data to another system that it knew needed it.

Take an example where you have a sales system that needs to get information from n vendors.  You could have an FTP site or web service where everyone can send their data.  That should work fine, until you really try it.  Then you end up with n file formats and n incoming FTP connections, user accounts, etc.  It’s a mess to maintain; ask anyone who has to.

What if you reversed that?  What if your company came up with a simple web service definition and each vendor put an instance of that service up and provided their data in the format specified by the web service?  Then your sales system simply has to go down its list of vendors and call them up to get the data.
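To make the reversed arrangement concrete, here is a minimal sketch in Python.  Everything in it is an assumption made up for illustration: the `VendorService` contract, the `PriceRecord` shape, and the `InMemoryVendor` stand-in (a real vendor would sit behind a remote web service call).  The only point is that the sales system owns one definition and walks its list of vendors.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class PriceRecord:
    """Hypothetical record format that every vendor agrees to return."""
    vendor: str
    sku: str
    price_cents: int


class VendorService(Protocol):
    """The single contract your company defines (hypothetical)."""

    def get_prices(self) -> list[PriceRecord]: ...


class InMemoryVendor:
    """Stand-in for a vendor's web service; a real one would be a remote call."""

    def __init__(self, name: str, prices: dict[str, int]) -> None:
        self.name = name
        self._prices = prices

    def get_prices(self) -> list[PriceRecord]:
        return [PriceRecord(self.name, sku, cents) for sku, cents in self._prices.items()]


def pull_all(vendors: list[VendorService]) -> list[PriceRecord]:
    """Go down the vendor list and ask each one for its data."""
    records: list[PriceRecord] = []
    for vendor in vendors:
        records.extend(vendor.get_prices())
    return records


vendors = [
    InMemoryVendor("acme", {"widget": 499}),
    InMemoryVendor("globex", {"widget": 525, "gadget": 1299}),
]
records = pull_all(vendors)
```

Adding a vendor here means adding one entry to the list, not another file format or another set of credentials on your side.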

Perfect, no problems ever.  Right?  Not quite.  Every solution will present you with new or different problems; the hope is that they have simpler solutions than the problems created by other choices.

I mentioned all of this without bringing up publish/subscribe, but that is the core of what we’re talking about here.  You have publishers of data and subscribers to that data.  The publisher provides a consistent, well-defined method of getting that data and subscribers come and get it.  It solves a lot of problems.  Service down?  No problem, subscribers can catch up later.  Something out of sync?  Not a problem, the publisher can resend something if needed.
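One way the "catch up later" and "resend" behavior could work is for the subscriber to remember how far it has read and ask for everything after that point.  The sketch below assumes that design; the `Publisher` and `Subscriber` classes, the `fetch_since` method, and the integer cursor are all hypothetical names, and the `available` flag only simulates the service being unreachable.

```python
class Publisher:
    """Keeps an ordered log of items and serves it on request (hypothetical)."""

    def __init__(self):
        self._log = []
        self.available = True  # flip to False to simulate an unreachable service

    def publish(self, item):
        self._log.append(item)

    def fetch_since(self, cursor):
        """Return every item from position `cursor` on, plus the new cursor."""
        if not self.available:
            raise ConnectionError("publisher is down")
        return self._log[cursor:], len(self._log)


class Subscriber:
    """Pulls from the publisher and remembers how far it has read."""

    def __init__(self, publisher):
        self.publisher = publisher
        self.cursor = 0
        self.received = []

    def poll(self):
        """Try to pull new items; return False if the publisher was unreachable."""
        try:
            items, new_cursor = self.publisher.fetch_since(self.cursor)
        except ConnectionError:
            return False  # nothing is lost, the cursor has not moved
        self.received.extend(items)
        self.cursor = new_cursor
        return True


pub = Publisher()
sub = Subscriber(pub)

pub.publish("order-1")
sub.poll()

pub.available = False          # the service endpoint goes down...
pub.publish("order-2")         # ...while data keeps accumulating behind it
pub.publish("order-3")
missed = sub.poll()            # this pull fails, and that is fine

pub.available = True
sub.poll()                     # the subscriber catches up on its own schedule

sub.cursor = 1                 # out of sync? rewind and ask again
replayed, _ = pub.fetch_since(sub.cursor)
```

The design choice worth noticing is that the subscriber owns the cursor, so a failed pull costs nothing and a resend is just a request from an earlier position.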

Push versus pull?  Right now I think things should lean towards pull.  It was a good enough model that the Department of Justice began developing WSDLs and asking partners to put up web services that met those specifications so it could come and get data from those partners.  To date it has been very successful, at least in our implementation.