Open Letter to Google App Engine: Billing for CPU time is Wrong


I've been planning to write this post since Google first announced a billing system for App Engine. Meanwhile, I thought it would be more useful to point out some problems with the system, hoping that the team will reconsider some of the quotas and the way the billing system is supposed to work.

The Google App Engine platform has been offered from the beginning under specific CPU, space and API quotas. Leaving aside the tons of applications that have been deployed on App Engine for playing or testing purposes, or just for the coolness factor, I strongly believe that others evaluated the alternatives and picked the platform to develop real applications, and that in doing so they took the original limitations and quotas into account.

Later on, Google previewed what would become, in February, the billing system. I don't think this was a surprise to anyone (in fact, I would have expected to see the tons of open bugs fixed, or at least explained, before seeing a billing system, but this is probably just how I see things).

CPU time, defined as

The total processing time for handling requests, including time spent running the app and performing datastore operations. This does not include time spent waiting for other services, such as waiting for a URL fetch to return or the image service to transform an image.

is part of the billing system. As you can see right from the definition, it includes internal API time. Basically, this means that you'll have to pay for something you have no control over (a very simple parallel to other pay-per-use services would be Amazon asking you to pay for their sporadic hardware replacements).
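To make the concern concrete, here is a minimal sketch of how a bill might move under that definition. It assumes billed CPU time is simply the app's own CPU time plus the CPU time spent in datastore operations, with waiting on other services excluded, as the quoted definition suggests. The function name and all the numbers are hypothetical and are not taken from any real App Engine log.

```python
def billed_cpu_ms(app_cpu_ms, datastore_cpu_ms, waiting_ms=0):
    """Hypothetical model of the quoted definition of CPU time.

    Assumption: billed time = app CPU + datastore CPU.
    Time spent waiting on other services (waiting_ms) is not billed.
    """
    return app_cpu_ms + datastore_cpu_ms


# Same application code in both requests: 40 ms of its own CPU work.
# Only the datastore's behavior differs (made-up figures).
normal = billed_cpu_ms(app_cpu_ms=40, datastore_cpu_ms=60, waiting_ms=200)
degraded = billed_cpu_ms(app_cpu_ms=40, datastore_cpu_ms=600, waiting_ms=200)

print("normal request:  ", normal, "ms billed")
print("degraded request:", degraded, "ms billed")
```

Under these assumptions the developer's code did exactly the same work in both requests, yet the second one is billed several times more, purely because of how the platform's datastore behaved.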

[Image: GAppEngineDataStoreTimeout.png] [1]

Meanwhile, the Google App Engine forums were (and still are) full of reports of the internal infrastructure misbehaving, with a clear impact on the applications' reported performance. I should also mention that there were cases when Google's own monitoring tools did not even catch the issues, not to mention that the team is failing to provide any real feedback about these problems.

Last but not least, Google App Engine is also reducing the original CPU time quota based on so-called resource usage statistics for a recent 7-day period. But they fail to mention whether the average was computed from active apps only or from all apps, including the test applications that cannot be deleted, and how recent these statistics are, considering I saw the same number mentioned a couple of months ago.

To summarize, I think billing for CPU time is wrong, and I suggest that the Google App Engine team reconsider it because:

  • the terms are not well defined [2]
  • it is not clear how they are measured
  • there have been repeated problems on the platform, and these affect the CPU analytics
  • it includes the CPU usage of internal framework API calls, which is not under developers' control
  • framework API calls are already billable separately

[1] The picture shows a datastore timeout error for an extremely basic 15-record fetch operation (log record from June 7th).

[2] While the cpu_time metric is defined on the Quotas page, the logs include other CPU-related metrics which are not clearly defined.