[Reblogged] A (semi) Unbiased Look into the Goodreads Vs. Booklikes Debacle

9:01 pm 9 October 2013

I asked my husband, an IT professional with over 10 years of experience, to give me an unbiased look (or as unbiased as he could get) into the latest fuckery from Goodreads Vs. Booklikes. He wrote me the following to post here. Be warned, he's kind of a shitty writer (I love you, Jon!)

"My wife asked me this evening about the Goodreads debacle where reviews and other things are disappearing. Goodreads is apparently blaming the issue on Booklikes due to an improper call to an API. She asked if this was at all possible and how it could happen, or if this was a bullshit excuse from Goodreads. In short, I’ll make a quick statement to say that none of us have anyway of actually knowing beyond speculating because the code is not open source. I’m going to attempt to do my best to explain this, but this covers some extremely complicated issues. I’ll give as concise crash courses in the understandings as I can. But again, difficult concepts to explain here without a bunch of schooling or studying. I’m giving my (essay) explanation here in an unbiased point of view possible.

First, to repeat myself: the code is not open source (to the best of my knowledge). Before anything else, I have to explain what an API is. An API (Application Programming Interface) is an encapsulated way for two or more pieces of software (or services) to communicate with each other. It’s because of APIs that we can have a ton of Facebook and Twitter apps for our phones. An API is a set of program instructions that allows one program to perform a specific internal function of another program (e.g. post a Facebook status from your Twitter app). The API call in question here is the review.destroy call (its documentation is currently disabled, but evidence of it can be found with some simple Google searches). That API call becomes important in a bit.

Also pay close attention to that word ‘encapsulated’. Encapsulation is an object-oriented programming (OOP) mechanism used to hide the data in one part of an app from another part of the same app. When one part of an app needs to call a function from another part, it’s supposed to pass along only the data that needs to be processed and get a result back, without being able to see how the work is actually done. It’s a normal convention in OOP.
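A minimal sketch of encapsulation in Python, to make that concrete. The ReviewStore class and its method names are made up for illustration; they are not taken from Goodreads or Booklikes code, which none of us can see.

```python
class ReviewStore:
    """Hypothetical service that hides how reviews are stored."""

    def __init__(self):
        # Leading underscores mark internal details callers should not touch.
        self._reviews = {}
        self._next_id = 1

    def add(self, book_title, text):
        """Store a review and hand back its id."""
        review_id = self._next_id
        self._reviews[review_id] = (book_title, text)
        self._next_id += 1
        return review_id

    def destroy(self, review_id):
        """Delete a review. Returns True if one was deleted, False otherwise."""
        return self._reviews.pop(review_id, None) is not None

    def count(self):
        return len(self._reviews)


# The caller only passes data in and gets a result back.
store = ReviewStore()
rid = store.add("Harry Potter", "Loved it.")
deleted = store.destroy(rid)
```

The caller never learns whether the reviews live in a dictionary, a database, or somewhere else. It only knows what it asked for and what came back.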

Knowing that, APIs are technically encapsulated to a degree. In other words, Booklikes could call on an API from Goodreads to perform a specific task without seeing how that task is performed. Likewise, Booklikes doesn't have to tell Goodreads how it’s handling that API call beyond giving the API the information it needs to process. Keep in mind, these API calls are extremely specific. The question isn't simply whether review.destroy was deleting reviews (it’s a very specific function meant to delete a review) but how and why it was being called and used. It might be easier to think of that API call as a completely mindless robot that only follows extremely unambiguous instructions. (E.g. you can’t simply say "go cut the grass." You have to say "I want every blade of grass of this specific shade of green that is at or above this specific length cut," and even that can sometimes be too ambiguous.)

Getting back to encapsulation: the point is that Booklikes can’t see how Goodreads is doing something any more than Goodreads can see how Booklikes is doing something. It goes both ways. They can only talk through that API with extremely specific instructions.

We also have to keep another point in mind here. Booklikes is doing all this in the name of synchronization. Syncing content is an extremely complex subject, and the science of it is still very much in its infancy. To this point, computer scientists have only figured out very crude ways of handling it. It’s the reason why iCloud is kind of crappy, why Gmail email syncing is very slow, and why you get nasty errors when you open a document from SharePoint while someone else is editing it. The complications of data synchronization could be a lecture in themselves, but they play a very real role here.

It should be stated that designing and building software is a very complicated thing. Programmers and engineers architect and lay out their projects long before a single line of code is ever written. With that said, this review.destroy API call was probably thought up very early on. It’s not unusual for a company to include an API call to delete information. In some cases, it’s needed. I bring this up because it may have been a function that they originally wanted or needed at some point but may no longer want people to use. It’s very difficult to deprecate API functions publicly because that tends to kill other people’s apps and create really bad blood. Usually when API functions are terminated, deprecated, or updated, there is a lot of notice beforehand. Goodreads may keep this function around for a lot of different reasons. Problem is, we don’t know exactly why. Could be they need it. Could be they are allowing it for the sake of ease for developers. We don’t know.

This all comes back to my point that because we cannot see the code and analyze it, we don’t know who is to blame. There are two possibilities (assuming this whole incident is because of this API function). Goodreads could be screwing up somewhere and calling that function internally when not needed (a bug). It could also be the same for Booklikes; they could be screwing up and calling the function when not needed. However, that does not explain WHY the API is being called for people who did not have Booklikes accounts at the time of review deletion.

To bring all of my points together (the droning on about the syncing and the deprecating API functions), let me try and put it in an example like this:

Let’s say that Booklikes has a function that will sync, or change, a review on Goodreads when it is updated on Booklikes. But before they sync that information, they have to see if a review already exists for the same book. They might have a function that says "go look for this book review and if it exists, delete it and put this in its place." They might be passing the wrong information, though, so the wrong book is identified as having a review. That book’s review gets deleted and the new review is posted. But because the new review was written for a different book, it ends up posted under the wrong book. Keep in mind that these calls are extremely specific, so the information being passed for what to look for might be wrong.
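Here is a rough Python sketch of that scenario. Everything in it is hypothetical: FakeGoodreads, sync_review, and the book ids are invented for illustration and say nothing about how either site actually works.

```python
class FakeGoodreads:
    """Stand-in for the remote site. This is not the real Goodreads API."""

    def __init__(self):
        self.reviews = {}  # book_id -> review text

    def find_review(self, book_id):
        return self.reviews.get(book_id)

    def destroy_review(self, book_id):
        self.reviews.pop(book_id, None)

    def create_review(self, book_id, text):
        self.reviews[book_id] = text


def sync_review(remote, book_id, text):
    """Go look for this book's review; if it exists, delete it and post the new one."""
    if remote.find_review(book_id) is not None:
        remote.destroy_review(book_id)
    remote.create_review(book_id, text)


remote = FakeGoodreads()
remote.create_review(101, "Review of book 101")
remote.create_review(202, "Old review of book 202")

# The user edited their review of book 101, but a bug passes id 202 instead.
sync_review(remote, 202, "Updated review of book 101")
```

Every call did exactly what it was told. The review of book 202 is gone, the review of book 101 never got updated, and the new text now sits under the wrong book, all because one wrong id was passed in.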

Using the same example, it could go the other way. Booklikes might be passing the right information along to Goodreads to see if a book review currently exists. But because, by some mistake or another, the data isn't encapsulated properly, or the algorithm being used is semantically just bad, the review.destroy function is called and a review is deleted.

To put a finer point on the ambiguity, the algorithm being used to decode the book title on either side could be told to "find Harry Potter" but be passed "harry potter." Those lowercase letters make a difference and have to be handled. The algorithm could allow a certain degree of “close-enoughness,” decide that this is indeed the book, and identify it wrong.
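A small Python sketch of the three kinds of matching. The function names and the 0.8 threshold are assumptions picked for illustration; nobody outside those companies knows what matching either site really uses.

```python
from difflib import SequenceMatcher


def exact_match(a, b):
    """Strict comparison: upper and lower case count as different."""
    return a == b


def case_insensitive_match(a, b):
    """Ignore letter case before comparing."""
    return a.casefold() == b.casefold()


def close_enough(a, b, threshold=0.8):
    """Fuzzy comparison: treat titles as the same if they are similar enough."""
    ratio = SequenceMatcher(None, a.casefold(), b.casefold()).ratio()
    return ratio >= threshold


strict = exact_match("Harry Potter", "harry potter")
relaxed = case_insensitive_match("Harry Potter", "harry potter")

# Two different books whose titles differ by one character.
mistaken = close_enough("Harry Potter 1", "Harry Potter 2")
```

The strict check rejects the lowercase title, the relaxed check accepts it, and the fuzzy check goes too far and treats two different books as the same one.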

That was a very crude example of how this could all be messed up. But again, without seeing the actual code we have no idea who screwed up in this case. All we know is that Goodreads allows data to be deleted through its API. We don’t know how that works behind the scenes at Goodreads, nor do we know how Booklikes is using that API."

