All Your Base 2014: A database conference

I attended the “All Your Base” database conference in Oxford on September 17th, 2014. This was a single-track one-day conference. Below are my notes, with links to videos of the talks:

First talk: Jeni Tennison, Technical Director at the Open Data Institute (video)

  • Distribution of information
  • How can everyone benefit from the data reformation?
    • What data can you get?
      • Open data; there’s quite a lot of it but should be more
        • Government data; leading the way because it’s quite easy to collect data for public good using public funds
        • Philanthropic organisations
        • Private sector data
          • Often closed because it’s commercially sensitive; what’s useful to the public could be used by the competition
        • Crowd-sourced data, e.g. OpenStreetMap
        • Data you own; your personal data that law gives you access to
    • What can you do with it?
      • Tooling to combine, aggregate and analyse data
      • Start simple and accessible; Excel is almost the de facto option
      • Visualisations, e.g. CartoDB for mapping
      • Yahoo! Pipes
  • Publish open data
    • statistical summaries
    • reference data
    • accessible, usable, shareable
    • so that everyone can benefit from the data reformation
  • Make tools that people need
    • visualisation
    • combination/aggregation
    • not just for developers
  • Up to us [technologists?] to make data available to everyone

Audience questions

  • How to handle technical illiteracy or target those without access to technology?
    • Target those without technical access indirectly via an intermediary

Second talk: Yodit Stanton, Founder at OpenSensors.io (video)

  • “Scaling the internet of things”
    • OpenSensors.io = Clojure shop
    • Connect an object to a web service; visualise with e.g. React or D3
    • Monitor your cat
    • Connect an object to one or many other objects
    • Which protocols to use? HTTP can be a little bloated
      • CoAP
        • Like HTTP but lighter
        • Uses UDP
        • GET, POST, PUT, DELETE
      • MQTT
        • PUB/SUB protocol
        • Tiny packets
        • Suitable for low-bandwidth devices
        • TCP
        • Uses topics (like RabbitMQ)
        • Security: Username/password auth; optional SSL
      • MQTT many-to-many; CoAP one-to-one (see the MQTT sketch after these notes)
    • Azondi engine; open-source
      • Project Reactor-based broker for MQTT
      • Uses the LMAX Disruptor Ring buffer
  • Don’t build an ‘Intranet’ of things, i.e. proprietary interconnectivity each with its own protocol
  • ChatOps: using chat commands to turn off your kitchen light or shut down your Roomba
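
A minimal pub/sub sketch of the MQTT side of this, using the Python paho-mqtt client (1.x API); the broker address, credentials and topic are hypothetical:

    import paho.mqtt.client as mqtt  # pip install "paho-mqtt<2"

    # Print every message published to the subscribed topic
    def on_message(client, userdata, msg):
        print(f"{msg.topic}: {msg.payload.decode()}")

    client = mqtt.Client()
    client.username_pw_set("user", "password")  # MQTT's username/password auth
    client.on_message = on_message
    client.connect("broker.example.com", 1883)  # runs over TCP; 8883 with SSL
    client.subscribe("home/kitchen/temperature")
    client.publish("home/kitchen/temperature", "21.5")
    client.loop_forever()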

Audience questions

  • Importance of security?
    • e.g. knowing when you’re on holiday by monitoring your home’s temperature
    • must be taken seriously from day one
      • Yodit: Yes it’s important, needs to be considered

Third talk: Ian Plosker, Co-founder and CTO of Orchestrate.io (video)

  • “Databases, The Cloud And Its Discontents”
  • Software has evolved based on the constraints of hardware
  • “Databases in the Cloud kinda suck”; not easy, doesn’t always work the way you’d hope
  • Orchestrate - essentially NoSQL via HTTP
  • Sigmund Freud: minds were forged before civilisation and are thus sometimes maladapted
  • Ian Plosker: databases were forged before the Cloud and are thus maladapted to the Cloud
    • Path dependence: choices made due to past constraints which are no longer relevant today
  • Assumptions that storage engines make
    • Disks are local
    • Disk is spinning media
    • Memory pages are contiguous
    • Kernel is omnipotent
    • Records have a repeating form and a consistent size
  • Assumptions distributed databases make:
    • The network is reliable
    • Nodes in a cluster share a switch
    • Nodes in a cluster are in the same datacentre
    • Switch ingress/egress buffers never fill up
    • Networks are not congested
  • To understand present choices, we must understand the past
    • Writing (~2600 BCE), libraries (~600 BCE) and paper (105 CE)
    • Punch card databases (~1800)
    • Drums (invented 1930, general use 1950s); ‘attached storage’; predecessors to hard drives
    • Disks (invented 1954, general use 1960s)
    • Solid state (invented 1950s, general use 2000s)
  • Review of storage and query models
    • File systems (proposed 1958, general use 1970s); e.g. ERMA Mark 1
      • Not efficient for storing data
        • Blocks are an inefficient abstraction
    • Two dominant models in 60s-70s
      • Navigational model; can span storage media, can be really large
      • Hierarchical model
    • Enter the Database Management System (DBMS) (1960s, general use 1970s)
      • Relational model
      • Needed B-trees to be invented
        • Improved time to search a file
        • Reduced insertion/deletion complexity
        • O(log N) for insert/update/delete/search (see the lookup sketch after these notes)
    • Summary:
      • 50s, 60s: constraint is size of storage media
      • 70s: B-trees arrive
      • 80s: Relational database systems; PC era
      • End of 80s: fight for relational database market
      • 90s: End of hierarchical/navigational database systems
      • ‘Web 1.0’: Static sites; more interactive in late 90s
      • 2005: Very few choices for databases; all SQL, all B-trees, very homogeneous, all employing the same tradeoffs
      • 2014: 25+ databases in Production that didn’t exist 8 years ago
  • Enter: The Cloud
    • Network is not reliable; congestion
    • Databases not really fit for the Cloud
  • Running out of time; “Orchestrate is the solution”
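
Not a real B-tree, but an illustration of the O(log N) lookup it buys you: binary search over sorted keys, which is what each B-tree node does internally (the tree then keeps node sizes matched to disk blocks):

    import bisect

    keys = []  # kept sorted, like the keys within a B-tree node

    def insert(k):
        bisect.insort(keys, k)  # binary search for the slot, then insert

    def search(k):
        i = bisect.bisect_left(keys, k)  # O(log N) binary search
        return i < len(keys) and keys[i] == k

    for k in [42, 7, 19, 3, 31]:
        insert(k)
    print(search(19), search(20))  # True False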

Fourth talk: Laine Campbell, Chief Unicorn and former CEO & Co-Founder of Blackbird.io (video)

  • Laine == Database architect
  • “Scaling MySQL in AWS”
  • RDS and EC2/MySQL: A love story
    • Amazon RDS: operations managed, ease of deployment, scaling via replication
    • Multi-AZ failover
      • Auto-failover
      • Block-level replication
  • Cost:
    • Instance tax: 35-37%
    • Provisioned IOPS: 15%
    • Lack of transparency can increase downtime
      • Can’t do dtrace or tcpdump
  • Cost over non-RDS: $15k USD
    • A DBA costs $144K/year ≈ $108/hour (including time off, productivity, retention, churn)
    • Automating EC2 is a one-time job; the RDS tax is paid forever
    • Extra cost of RDS = 51 weeks of a DBA’s working time
  • Lock-in:
    • In MySQL 5.6 you can replicate out of RDS, making this moot
    • Lack of visibility:
      • dtrace, tcpdump, top, vmstat etc.
      • i.e. is NTP in sync? can’t see without asking Amazon
    • Lack of control:
      • data security, shared environments, backups?
      • restarts due to exploits, etc.
  • Running on EC2 without RDS
    • OpenStack with Trove, open source
    • Tesora Trove; DBaaS; community edition is free
    • RDS versus EC2
      • RDS for legacy and less demanding SLAs
    • which version of MySQL? MariaDB? XtraDB?
    • storage option charts: ephemeral SSD gives max throughput of 390 MB/s
      • free with instance cost
      • provisioned IOPS, etc.
  • Definition of scaling:
    • capacity is elastic, automated
    • consistent performance
    • etc.
  • Workload scaling:
    • does it need to be relational?
    • break out workloads into their own clusters
      • facilitate sharding large data-sets
      • segregate specific workload characteristics
    • evaluate the workload’s read/write needs
      • total dataset size and growth
      • data change
    • Constraints:
      • write I/O
      • replication limits: single-threaded pre-5.6, parallel per schema in 5.6
      • operational difficulties (lack of experience)
      • Summary
        • main constraint is write IOPS
        • sharding eliminates constraint
        • ways to reduce reads:
          • caching
        • reduce writes:
          • throttling concurrency with queueing and loose coupling to keep write I/O down (sketched after these notes)
          • compression: native or application-based (prefer application-based)
    • failover scenarios
      • Amazon region failure: 1 to 2 times/year
      • failover to another region and bring up application services from snapshots in the event of a region failure
        • bring up slave replicas on failover event
        • ~30 minutes’ downtime, but the event is rare
  • Cluster management
    • Roll your own…
      • configuration management
      • automation scripting
      • HAProxy
    • Spend some dough
      • RDS
      • Continuent Tungsten
      • ScaleArc
    • Bleeding edge “but clotting”
      • Galera, via MariaDB or XtraDB Cluster
      • Works really well; close to Production-ready, running healthcare.gov
  • Resiliency
    • EBS snapshots
      • Rapid redeployment of failed nodes; kill and rebuild
  • Operational visibility
    • Monitoring and alerting: Sensu (Nagios not good for ephemeral environments)
    • Time-series trending: Graphite or OpenTSDB (HBase)
    • Graphing of data: Grafana
    • Logging: Logstash or Splunk
    • Application analytics: New Relic or AppDynamics
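
A toy sketch of the write-throttling idea: producers enqueue rows and a single worker flushes bounded batches, so the database sees steady write I/O instead of spikes (the names and batch sizes are made up):

    import queue
    import threading
    import time

    writes = queue.Queue(maxsize=1000)  # bounded: producers block when full

    def flush(batch):
        print(f"flushing {len(batch)} rows")  # stand-in for one bulk INSERT

    def writer_loop():
        while True:
            batch = [writes.get()]
            while len(batch) < 100:  # drain up to 100 rows per batch
                try:
                    batch.append(writes.get_nowait())
                except queue.Empty:
                    break
            flush(batch)
            time.sleep(0.05)  # cap write I/O at ~20 batches/second

    threading.Thread(target=writer_loop, daemon=True).start()
    writes.put({"user_id": 1, "event": "login"})
    time.sleep(0.2)  # give the worker a moment to flush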

Fifth talk: Patricia Gorla, Cassandra Consultant at The Last Pickle (video)

  • “NoSQL Data Modelling in Apache Cassandra With CQL”
    • Case study of how to build out a data model
  • What’s Cassandra?
    • Created by Facebook, open sourced in 2008
    • Distributed, by hash; nodes are rack- and datacentre-aware
    • Eventually consistent (not ‘hopefully consistent’)
    • Consistency level for reads:
      • quorum
      • one
      • all
      • any
    • Fast; append-only
      • 2.1 - 190,000 wps
      • 2.0 - 105,000 wps
    • CQL = Cassandra Query Language
    • cqlsh CLI tool
  • The hard part: data modelling
    • no concept of joins in Cassandra
    • all data denormalised
    • data will be duplicated
      • how to manage updating data in multiple places
    • identify usage
      • schema can develop but can save headaches by planning in advance
  • Case study: City bike share application
    • List properties of a bike:
      • is it damaged? where is it? is it checked out (in use)?
    • Can the bike be checked out?
      • traditional method: query inside a transaction
      • Cassandra would use a Paxos-based ‘lightweight transaction’; avoid these for performance-sensitive tables
    • Get customer details
      • Address would be serialised (sub-columns but can be directly accessed)
    • How many bikes available at a particular station?
      • separate table bikes_at_stations_count
    • List trips a customer has taken
      • correlate bike ID to trip ID?
      • flaw: all trips for a bike would be stored in the same partition
        • add an arbitrary ‘bucket’ column
    • User data: use WITH CLUSTERING ORDER BY to find the most recent (see the CQL sketch after these notes)
  • Conclusion:
    • There is hope
    • Understand access patterns
    • Be mindful of the storage engine
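
A hedged CQL sketch of the bike-share trips model via the Python cassandra-driver; the keyspace, table and column names are my own invention, but the (partition key + bucket) and CLUSTERING ORDER BY shapes follow the talk:

    import uuid

    from cassandra.cluster import Cluster  # pip install cassandra-driver

    session = Cluster(["127.0.0.1"]).connect("bikeshare")

    # Partition by (customer, bucket) so one customer's trips can't grow
    # into an unbounded partition; newest trips first within the partition.
    session.execute("""
        CREATE TABLE IF NOT EXISTS trips_by_customer (
            customer_id uuid,
            bucket      int,        -- e.g. the year, to bound partition size
            trip_start  timestamp,
            trip_id     uuid,
            bike_id     uuid,
            PRIMARY KEY ((customer_id, bucket), trip_start)
        ) WITH CLUSTERING ORDER BY (trip_start DESC)
    """)

    # Most recent trips come back first; no ORDER BY needed at read time
    customer_id = uuid.uuid4()
    rows = session.execute(
        "SELECT trip_id, bike_id, trip_start FROM trips_by_customer "
        "WHERE customer_id = %s AND bucket = %s LIMIT 10",
        (customer_id, 2014),
    )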

Questions

  • How do you guarantee uniqueness?
    • You can’t; Cassandra is append-only
    • Uniqueness must be enforced by the application, e.g. with UUIDs

Lunch


Sixth talk: Ben Foxall, Developer at White October (video)

  • “Data - A Post-prandial Demo…”
  • Not a traditional talk [notes may be sketchy]
  • Collecting transient things from multiple devices
    • Go to http://bit.ly/aybstuff to vote on questions; click button to vote
    • Using D3 to visualise the data
      • Dots light up on visualisation
    • Convert to bar graphs, heat map showing correlation between responses
    • Enter your Twitter handle
      • Graph followers, followees; use Twitter avatar in visualisation
    • Firebase to store data
    • Tweet each other; show connections as a graph

Seventh talk: David Mytton, Founder & CEO of Server Density (video)

  • “Stories from Production - Scaling, failure and (avoiding) data loss”
  • Server Density does web site monitoring
  • Production war stories from Server Density
    • Architecture
    • Performance
    • Where to host?
    • Downtime
    • Preparation
  • ~100 servers running Ubuntu 12.04 Precise
  • 50:50 virtual/dedicated
  • 200TB/month of processed data
  • Nginx, Python, MongoDB
  • SoftLayer: 1TB RAM, >5GB SSDs
  • Advantages of virtualisation:
    • Easy to manage
    • Fast boot
    • Easy to resize/migrate
    • Templating/snapshots
    • Containment
  • Disadvantages of virtualisation:
    • Another layer
      • Hypervisor overhead
      • Host contention
      • Poorer I/O performance
  • Bare metal advantages
    • Dedicated resources
    • Direct access to hardware
    • Customisable specification; not constrained to a pigeon-hole
  • Disadvantages of bare metal
    • Build/deploy time
    • Lead time can be weeks
    • More difficult to resize (need to replace physical hardware)
    • Difficult to migrate/snapshot
    • High capital expenditure; lifetime (less efficient as it gets older)
  • Performance problems?
    • Move to bare metal
  • Key performance factors
    • Network; round-trip time (replication lag); consider speed of light (fibre optic connections)
    • Memory; much faster than disk but expensive; SSDs are cheaper
  • Why use the Cloud?
    • Flexible
    • Unlimited resources
    • Cheap to get started (no/little CapEx)
  • Why use colocation?
    • Vastly cheaper
  • Cloud outages
    • Amazon US East region ‘service event’; excellent post-mortem reports
    • No matter how much you spend, there are always outages
  • Preparation
    • Rotations (on call)
    • Reachability (no phone signal? ‘do not disturb’? phone upgrade?)
    • Working the next day after responding to an alert? Take the next day off.
    • Documentation:
      • Searchable
      • Easy to edit
      • Independent of your infrastructure
    • Unexpected failures
      • Communication systems (what if HipChat/Slack is down?)
      • Network connectivity
      • Access to vendor support
    • When you get an alert:
      • Incident response checklist
      • Log incident in JIRA
      • Log into Ops War Room
      • Public status post
      • Initial investigation, followed by post-mortem
    • Key response principles
      • Log everything
      • Frequent public status updates
      • Gather the team
      • Escalate issues with vendor support

Eighth talk: Laura Thomson, Senior Engineering Manager at Mozilla (video)

  • “Many Moving Parts: Monitoring Complex Systems”
  • Laura: “I’m not a sysadmin, I’m a developer”
  • Everyone has experienced: http://cdn.meme.am/instances/500x/22605665.jpg
  • System: Socorro
    • Telemetry system for Firefox
    • Receives data when Firefox crashes
    • Add comments to the crash reports, they get prioritised; “Hi Laura!”
    • Crash stats: https://crash-stats.mozilla.com/home/products/Firefox
    • Use multiple databases to take advantage of different aspects of CAP
      • HBase, PostgreSQL, Elasticsearch
    • 3000 crashes per minute, 3 million per day
    • Crash size: 150KB to 20MB
    • ~800GB in PostgreSQL
    • Like a firehose
  • Shipping Firefox
    • Complex architecture
    • 4k build and 10k test jobs per day
    • topped 100k/day in August
    • record: 690 commits in a single day in August
    • ~6000 machines
    • 80% of build and 50% of test in cloud
    • all builds using Amazon spot instances, 75% saving in costs
  • How to monitor complex systems
    • Pull data for normal operation
    • Push soft alerts for warnings
    • Pages for critical issues
    • Analytics tell you: “Is this load normal?”
      • visualise healthy system performance
      • makes it easier to find anomalies
    • monitoring coverage is logically equivalent to test coverage
      • (and suffers the same issues)
      • easy things to monitor/test versus things we really care about
      • ‘more’ is not better; noise is enemy number one
    • Thomson’s Hierarchy of Monitoring
      • Diagnostic
        • Host DOWN; 500 HTTP errors; replication lag
        • You know where to look; you know what you need to fix; not always simple but well-defined
        • Log monitoring
          • Something is wrong; maybe upstream?
          • Why is it useful?
            • The thing you don’t know how to monitor yet
        • Monitoring connection errors
          • 1% errors are normal; 5% means something bad is happening
        • Aggregate; show error rates
      • Trend monitoring
        • Disk is 85% full; how fast did it get into that state?
          • Rates of change are very important to monitor (see the sketch after these notes)
      • Performance monitoring
        • e.g. New Relic, Sentry
        • tooling is improving
        • how long does front page take to load?
        • need threshold/trend alerting
        • watch changes following deployments or new versions
        • Transactions/hour, conversion rates; is something broken if there’s an anomaly?
    • Automated monitoring is better than (angry) human monitoring
    • Monitors == tests
      • Run tests on Prod, but be careful (the Knight Capital disaster)
      • If you have an outage that wasn’t monitored, add a check (like you’d protect against a bug regression)
    • Gracefully degrade
      • Decouple unreliable components
      • Load/feature shed; temporarily ditch unimportant functionality
      • Automate recovery
    • What changes in larger systems?
      • Thresholds more interesting than single alerts
      • Many alerts should become warnings
      • Avoid alert fatigue
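
A small sketch of trend monitoring, assuming disk samples arrive periodically; the threshold and names are made up, and the point is alerting on the rate of change rather than only the level:

    import time
    from collections import deque

    samples = deque(maxlen=60)  # (timestamp, percent_full) pairs

    def record(percent_full):
        samples.append((time.time(), percent_full))

    def filling_too_fast(limit_pct_per_hour=5.0):
        if len(samples) < 2:
            return False
        (t0, p0), (t1, p1) = samples[0], samples[-1]
        rate = (p1 - p0) / ((t1 - t0) / 3600)  # percentage points per hour
        return rate > limit_pct_per_hour

    record(80.0); time.sleep(1); record(85.0)
    print(filling_too_fast())  # True: the disk is filling far too fast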

Ninth talk: Ola Gasidlo, Frontend & JavaScript Developer at Hood.ie (video)

  • “Say Hello to Offline First”
  • Huge dependency on online applications
    • Need to find ways to stay connected even when offline
  • Global problem; connectivity not omnipresent
  • Need to change the way we build applications
  • Some good offline examples: Google Maps (more recently), Gmail, minutes.io
  • What are the problems we’re facing?
    • We know almost nothing about offline users
  • Offline != error handling; it’s a recurring state
  • Inform user they’re offline?
  • Build trust
    • Trust that the app won’t lose data
      • So the user need not worry about losing connectivity
    • Redesign the save icon
  • Organise data
  • Graceful errors
  • Decide what metrics we base architectural decisions on
  • Before, the server was always the source of trust; client completely dependent
    • Need to decouple this relationship
  • How to handle disconnection
    • Block requests? Ignore request? Do nothing?
    • Use sensible timeouts (no one will actually wait for one minute)
    • Try buffering - flush early to show enough data for end-user to read while the rest loads
    • Hood.ie = open-source JavaScript library
      • Three tiers:
        • Frontend (client-side)
          • Only speaks to hoodie.store
            • Hoodie uses Pouch for storage
          • Do you even need to store data server-side, aside from user credentials?
        • Hoodie Sync
          • Client-side interface between frontend and backend
        • Backend (server-side)
          • REST interface to CouchDB, used by Hoodie Sync when device is online
          • Plugins (node.js)
      • Eventual consistency
      • Application cache manifest
        • Fast
        • Offline-first
        • But application cache is a douchebag
          • Serves cached content even when online
          • Only updates if the manifest changed
          • App cache is an additional cache
          • Never change the manifest URL
            • Ever. The user will still see the old app manifest
          • Non-cached resources, e.g. images, will not load
          • Use appcache-nanny
            • additional cache
            • deals with all requests
            • offline first, push notifications and more
    • First law of offline-first: Don’t lose any data
    • PouchDB is inspired by CouchDB
  • Offline should be a feature

Tenth talk: Dr. Alistair Hann, CTO at Skyscanner (video)

  • “SomeSQL - Scaling in a Changing World of Data Stores and Hardware”
    • AKA “Using all the databases”
  • Skyscanner: 150 million live searches per month
  • Skyscanner searches and aggregates data then makes it available via APIs
  • When searching:
    • Which websites should we show? Which will have the most relevant data and good pricing?
    • What pricing do we already have cached? (cache-aside; sketched after these notes)
    • Live update what we still need
    • Return prices to user
    • Clean up and save new data
  • Were using Microsoft SQL Server
    • Introduced read layer to scale horizontally
    • Started partitioning tables
    • Queueing tables
    • But there was latency: cached result not necessarily available on second request
    • Then added memcached to cache prices
    • Was OK but not sustainable
      • Move from queueing tables to use RabbitMQ
      • Couchbase to cache prices (rather than memcached)
      • Swapped out network storage for SSDs
      • Then removed SQL server; rely entirely on RabbitMQ and Couchbase
  • New car hire service
    • More complicated queries so used map-reduce for those in Couchbase
    • Which car hire companies cater to certain itineraries - lots of aggregation required
  • Hardware landscape changing
    • Flexible flight search == demanding load
    • Limited by IOPS
    • Tried Fusion-io cards for 0.25M mixed IOPS
    • Needed too many IOPS; started re-architecting applications to make use of commodity hardware, i.e. Amazon EC2
    • Now using Thrift; Hadoop/Amazon Elastic MapReduce
  • Distributed commit log - see the blog post from a LinkedIn engineer
    • Now using Kafka, which receives raw events from fluentd
    • Searchable using Elasticsearch
  • Timetable data still using relational database; makes sense given data structure
  • Using PostGIS (PostgreSQL) for geo searches
  • Still facing the same challenges
    • RAM and disk I/O concerns
    • Administration (skills)
    • Security (NoSQL, not NoSecurity)
    • Data insertion and retrieval
    • Monitoring and alerting
    • Performance optimisation
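
A hedged sketch of the cache-aside flow described in the search steps above (serve cached prices, live-fetch only what’s missing, write fresh quotes back with a TTL); the dict stands in for memcached/Couchbase and all names are made up:

    import time

    cache = {}  # route -> (expires_at, price); stand-in for Couchbase
    TTL = 15 * 60  # price quotes go stale quickly

    def fetch_live(routes):
        return {r: 99.00 for r in routes}  # stand-in for querying partners

    def get_prices(routes):
        now = time.time()
        prices, missing = {}, []
        for route in routes:
            hit = cache.get(route)
            if hit and hit[0] > now:
                prices[route] = hit[1]  # cache hit, skip the slow path
            else:
                missing.append(route)
        for route, price in fetch_live(missing).items():
            cache[route] = (now + TTL, price)  # save new data for next time
            prices[route] = price
        return prices

    print(get_prices(["LHR-JFK", "LHR-SFO"]))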

Eleventh talk: Monty Widenius, Creator of MySQL and MariaDB (video)

  • “The Current State and Future of MariaDB”
  • Why Maria?
    • Oracle bought MySQL
    • Wasn’t clear what would happen
    • Keep the MySQL talent together
    • To ensure that a free version of MySQL always exists
    • MariaDB Corporation employs 70+ people, 20+ of whom are engineers working on MariaDB
    • MariaDB Foundation to prevent yet another fork (“another child”)
  • Drop-in replacement for MySQL server
  • Enable feedback plugin to help MariaDB know what to focus on
  • 10.1 will include Galera, multi-master replication
  • MariaDB 5.2:
    • SphinxSE text search
    • Virtual columns
    • Extended user statistics
    • Segmented MyISAM key cache
    • Pluggable authentication
  • MariaDB 5.3:
    • Faster subqueries (subquery optimisation)
    • Faster joins; batch key access (BKA)
    • NoSQL support
      • 50% faster HANDLER commands
      • each row can have a different set of columns (dynamic columns; see the sketch after these notes)
  • Support for storage backends in MariaDB 10.0
    • LevelDB
    • Cassandra
    • Connect
    • RocksDB
    • TokuDB
  • Support for thread pools
  • MariaDB 10.0 rather than 5.6
    • MariaDB 5.5 was already a superset of MySQL features
    • Code from MySQL 5.6 not portable to MariaDB (some code too complex to port over or could be done better)
  • Much better GIS support in MariaDB 10.0
  • Global transaction ID in MariaDB 10
    • Can be enabled/disabled independently and online for both masters and slaves
    • Supports multi-source replication
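
A hedged sketch of MariaDB’s dynamic columns, driven from Python with pymysql (any MySQL client works the same way); the table and connection details are made up:

    import pymysql  # pip install pymysql

    conn = pymysql.connect(host="localhost", user="demo",
                           password="demo", database="test")
    with conn.cursor() as cur:
        # Each row packs its own set of named columns into a blob
        cur.execute("CREATE TABLE IF NOT EXISTS items "
                    "(id INT PRIMARY KEY, attrs BLOB)")
        cur.execute("INSERT INTO items VALUES "
                    "(1, COLUMN_CREATE('colour', 'blue', 'size', 'XL'))")
        cur.execute("SELECT COLUMN_GET(attrs, 'colour' AS CHAR) FROM items")
        print(cur.fetchone())  # ('blue',)
    conn.commit()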