A Bolt of Lightning in the Dark

Part III of my story: How iterative workflow improvements helped me keep my sanity

Saturday, September 10, 2016

In Part II of this essay, I discussed our first attempt to solve a major problem with our workflow -- receiving Excel files intended for use as a form -- and both what went right and what went wrong.

This is the ongoing story of how I solved my problems, from the technical side. I am a former editorial systems editor at the Orlando Sentinel, and a real geek when it comes to problem-solving, so this felt like it was tailor-made for me. I didn't have to hire a consultant or pay thousands of dollars a year in software.

Chapter I: An Anatomy of a More Durable Solution

Our old, Acrobat submittable form was broken due to a firewall change that IT made. I'm not part of the IT department, so I didn't have much leverage, but in any case after discussing the situation with the VP of IT, it wasn't even clear that it would be fixable by us... or at what cost?

We had just cleared our monthly wave of requests, so I had about three weeks to implement a fix. It didn't take long for me to come up with a pretty clear set of requirements:

  1. Must be possible to support the XFDF-based workflow on our end. We're very particular about what information is captured, and it's important to us that our job jackets fit on one sheet front and back.
  2. End-user functionality must rely on standard software available on Windows 7 or Web functionality, with no additional dependencies or requirements.
  3. Cannot be blocked by IT firewall upgrades, or blocking must be easily and quickly whitelisted.
  4. Reliable logging of requests at point of entry, so that even if other pieces fail, the data of the request can still be retrieved.

So I did what anyone else would do under the circumstances: Google research. I spoke with vendors who offered us outrageously expensive solutions -- hundreds of dollars a month, thousands of dollars a year -- all targeted at very large teams or extremely complex workflows.

The problem is that we're a small, in-house team. At the time, there were two of us. I couldn't come up with an ROI that made sense for such a small team at a price of more than a few hundred dollars a year. After all, our team is small, physically located in just one office, and uses a paper-based flow as our primary workflow (mostly because we're all in the same place). So elaborate workflow-based solutions didn't make sense to us.

One other thing: I really wanted to keep my XFDF-based workflow if I could. I was happy with the output, and it gave us one really nifty piece of functionality that I didn't think anyone else could. If I wanted to change the design of the form, for any reason, I could do so at any time and as long as we didn't change the underlying data fields, everything would just work.

Unfortunately, no one other than Adobe appeared to sell anything relying on XFDF (or even plain FDF, a binary format) in 2015. A lot of the solutions would load form input into a database, but not necessarily provide a robust way out. And many solutions offered printable forms that were just a literal translation of the web form -- whereas we had a highly customized printable form that we were happy with.

Oh, and no, Adobe LiveCycle wasn't an option.

One really important thing to me was that the experience of the last failure had taught me not to rely on services we created ourselves and hosted internally. As much as possible, I wanted to rely on standard web services that I could count on being available 24/7; if I was on vacation and our internal web server went down, I didn't want the entire process coming to a standstill until I got back and could get it up and running again.

I kept coming back to the same problem, again and again, but I finally hit on a beauty, like a bolt of lightning. (That's a play on words.) What was the answer? Zapier

Zapier is essentially the glue that holds together multiple web services, most of which have APIs for inbound and outbound requests but not a lot of support for the middleware that connects it together.

With Zapier, I could plug lots of web services together so that I could have my cake and eat it too. I could add a web form, keep my XFDF-based workflow, add a web-based job management system to help us keep track of workflow states and deadlines, add Slack notifications and a whole lot else.

I just had to figure out what other ingredients I needed to bake the proverbial cake.

Chapter II: Sifting the Bran from the Wheat

After a lot of research and trial-and-error, what I ended up settling on was fairly straight-forward:

Zapier's robust library of integrations meant that it worked beautifully with Google Sheets. Using Google Sheets itself meant that it would be possible to replace JotForm, later on down the line, if we wanted to or needed to.

Best of all, Zapier can both receive a notification of a new entry from Google Docs, and send that notification to an arbitrary web service.

I'm getting ahead of myself a bit, but we've ended up using Zapier for all sorts of stuff!

Now, I wasn't crazy about having to write that custom XFDF application. But it's relatively disposable. In the event of a complete failure, we can print out the JotForm input, and we have a backup because the job is also automatically created in Asana. Not a key part of the workflow.

Chapter III: Here's How I Did It

OK, I'm a huge fan of Jacques Pepin. I think I've used this reference before, but he always starts his shows with an example of something that looks wonderful and is quick -- and then he says: "Here's how I did it."

For those of you who aren't super-technical nerds like me, this is probably not going to be very interesting. But if you've ever had a weird technical challenge like this, this might be pretty compelling.

We decided to adopt Asana for job management and JotForm for the front-end entry. I liked JotForm best because it could both send its data directly to a service (useful in other contexts, like Mailchimp) and to Google Docs to be stored in a spreadsheet.

There was only one problem with all this, which I hadn't anticipated. No matter how hard I tried,

From there, the actual process of building it out went something like this:

  1. Analyze the data output from JotForm to Google Docs.
  2. Set up the Google Doc-to-Zapier inbound integration. This was really easy -- it's point-and-click.
  3. Create the Zapier-to-Asana integration. Again, this was point-and-click.
  4. Create the outbound Zapier webhook to generate the XFDF file. Because I didn't know what the data POST'ed to my service would look like, and honestly the Zapier documentation was pretty thin (it's since been beefed up a lot), I sent the webhooks to RequestBin first.
  5. Create the Slim application that would take the webhook, sanitize the data, and output an XFDF file.

Steps 1 through 4 took a couple of hours. I futzed around with it a bit more, but I was really just tinkering around the edges. Without Zapier, I think it would've taken weeks just to get to a place where it mostly worked.

Again, three cheers for Zapier.

I will say that, due to the relatively more limited functionality the Zap Editor had in late 2014 when I created these workflows, I had to make a separate zap to Asana for each type of request that I expected to receive. This requirement wouldn't exist now; I'm sure I could insert some logic or execute custom JavaScript. But that wasn't available then.

All the same, this was made extremely easy by Zapier.

Even Step 5 wasn't that challenging when you consider the structured input you get from Zapier: An array in Slim's $app->request, along with with any query string parameters I specified when I created it in Zapier.

Originally, the intent was to create a system whereby I could reuse the service for some other purposes. I had this crazy idea that I would actually make a generic service to create XML from HTTP POST, and submit the template for the XFDF file and the e-mail output via the POST request. But I ended up having to do a sufficient amount of customization that in the end, this didn't really make sense.

A few legacies of that are enc, which is required for XML files; prefix, which I actually do need, in order to strip off the gsx$ that Google appends to all form data on API output; and emailto, which allows me to specify how the e-mail with the XFDF file gets distributed.

If anyone reading this has an idea for how this could either be turned into a regularized service -- if it would even be useful to anyone else -- I'd be happy to share the broader code.

But here's the meat of it, which turns the data Slim is storing in $app->request->getBody() into an XFDF file:

    // set up XFDF header
    $xfdf_data='<?xml version="1.0" encoding="'.$enc.'"?>'."\n".
    '<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">'."\n".
    '<fields>'."\n";

    foreach ($array as $field => $val) {
            $xfdf_data.='<field name="'.$field.'">'."\n";
            if (is_array($val)) {
                    foreach ($val as $opt)
                            $xfdf_data.='<value>'.htmlentities($opt,ENT_XML1,"UTF-8").'</value>'."\n";
            } else {
                    $xfdf_data.='<value>'.htmlentities($val,ENT_XML1,"UTF-8").'</value>'."\n";
            }
            $xfdf_data.='</field>'."\n";
    }

    $xfdf_data.='</fields>'."\n".
            '<ids original="'.md5($array['formlocation']).'" modified="'.time().'" />'."\n".
            '<f href="'.$array['formlocation'].'" />'."\n".
            '</xfdf>'."\n";

The rest is easy as pie: Write the file and fire off a request to the Mandrill API to create the new email with the attachment.

VoilĂ !

Chapter IV: Could This Be Simplified?

As Zapier has continued to evolve and become a LOT more robust, with complex, multi-step workflows and if/then controls, one of the questions that's been nagging at me is if I could simplify the process a bit more.

We've actually had to solve one key point of failure already. Our team is Mac-based, but we only have access to email in Windows, via Citrix, for security and HIPAA control reasons. So we're very heavily dependent on our ability to access files via Windows. Over the summer, IT made a change to our Windows 7 platform that broke the XFDF file association, so that those files would no longer open correctly in Windows.

So we had to scramble a bit to find another way. The answer, it turned out, was to simply send the XFDF files to a Slack channel instead of Windows.

(Oh, but the, Slack changed the MIME type of files ending in .xfdf to text/xml as part of a server upgrade. That was fun, because if we clicked on the file in the channel, it would simply download the XML as text. And Acrobat won't import anything that doesn't exactly match its spec. We filed a bug report, and Slack was very nice about fixing it, but I had to copy and paste every new request to create the XFDF files, manually, until it was fixed.)

Periodically, I still search for the possibility of a solution out there that would allow me to stop hosting a custom application to generate the XFDF files.

Surely, like the Fermi Paradox, it's statistically unlikely that there isn't a solution out there somewhere.

But so far, I haven't found it.

Now, that's a lot about technological solutions. In fact, like "The Lord of the Rings," the middle book in this enormous essay is probably the longest and slowest, and bogged down a lot by details.

There's hope on the horizon. I promised, in the first installment, that some of this would focus on how process improvements made my work life simpler. We haven't gotten there yet, but we will. Look forward to that in Part IV.


Earlier: Malformed XML File Cannot Be Parsed