WordPress, Markdown, and HTML – Oh My…

April 18, 2017 Leave a comment

So it’s been a while since I’ve posted something – Basho and Riak have kept me quite busy since I started there. More to come in the near future, but as I was reviewing my old blog posts I realized somehow WordPress had mangled all of my posts, turning them into some hybrid of Markdown and HTML that has ruined most of the code examples. I’ll try to get to updating them soon, and sorry for the inconvenience.

Categories: Uncategorized

IoT Day Norway talk on Elixir and the Internet of Things – 2014

October 7, 2015 Leave a comment

So I can find it more quickly later, my IoT talk from NDC’s IoT Day in Norway is available. This version incorporates some of the conversation Joe Armstrong started in my Strange Loop talk into the presentation itself.

 

Categories: Uncategorized

OS X Yosemite and Max Open Files

October 20, 2014 6 comments

UPDATE

My colleagues at Basho have a better solution to this, which also works with launchd. Use theirs instead of mine!

 

DEPRICATED – SEE ABOVE

A quick note on Ulimits and OS X Yosemite. If you need to increase your ulimit for max open files, some of the old tricks (especially launchctl.conf) no longer work. You’ll need to create a new file,/etc/sysctl.conf if it doesn’t already exist, and add the following lines:

kern.maxfiles=<big number>
kern.maxfilesperproc=<slightly smaller or equal big number>

`kern.maxfiles` determines the maximum number of files available to be open across the entire system. `kern.maxfilesperproc`, not surprisingly, sets the limit for a single process.

Save the file, and reboot for good measure, and you should now be able to set a per process max open files greater than the default 10240. Unfortunately, /etc/launchd.conf appears to be ignored now due to ‘security concerns,’ so you’ll have to add a line to your .bashrc or .zshrc (or your .rc) to up your ulimit if you don’t want to do it every time you start up whatever program you have that needs it:



ulimit -n 32768

As I’m working on Riak, which requires many more open files than the defaults. Per the Riak Docs, I’ve set mine to:


kern.maxfiles=262144
kern.maxfilesperproc=32768

This should support an 8-node cluster’s worth of open files, which is way more than I’ll ever actually run on this machine. Mind you, if you’re attempting to run a very large cluster, or a cluster with many, many, keys, you may need to tune this further. In other words, YMMV.

Categories: Mac OS X, Riak Tags: , ,

Strangeloop Elixir and the IoT talk video available

October 20, 2014 Leave a comment

I love Strangeloop’s videos, as they capture the screen separately from the speaker and then merge the two together nicely. My talk on Elixir and the Internet of Things from Strangeloop 2014 is available:

Categories: Uncategorized

Video from my Dayton Elixir talk on IoT available.

August 19, 2014 Leave a comment

For those who would like to hear my talk on Elixir and the Internet of Things, it’s available on YouTube:

Categories: Uncategorized

A Gentle Introduction to OTP

April 24, 2014 Leave a comment

Supervisors, Workers, and Trees – An Overview

Open Telecom Platform, or OTP is a group of Erlang libraries that help implement distributed, fault-tolerant systems. Because of the patterns that are encouraged by the OTP libraries, these systems tend to be built of lots of small, independent processes that are easy to reason about and manage. And while the name implies that they are only useful for Telecom applications, many types of applications can be built using OTP.

At its simplest, an OTP application, at runtime, is nothing but a tree of Erlang processes. There are two fundamental kinds of OTP processes (implemented as Erlang behaviors). The first is the Supervisor, the job of which is to start up its children manage their lifecycle. The other is the Worker (of which there are several sub-types). Workers, as the name implies, do the work of the system.

I ended my third installment on Elixir and the Internet of Things with a picture of the Supervisor Tree we’ve built for our SSL RabbitMQ endpoint host. As a reminder, it looks like this:

Owsla Supervisor Tree

For those of you who are familiar with OTP, this diagram probably already makes sense. If you’re not, follow along and I’ll try to make some sense of this slightly different version of a typical “circles, boxes, and lines” diagram.

Supervisors

Supervisors do nothing but start, monitor, and (potentially) restart their children. While this sounds like a small thing, supervisors are, in may ways, what give OTP applications their reputation for stability and resilience in the face of errors. I’ll go into more details about supervisors, supervision strategies, and restart settings in a later post, but for now understand that each supervisor can be configured to do start up different kinds of children and react differently when those children die.

Workers

As their name implies, OTP workers perform the actual work of the application. There are three generic worker behaviors provided by the OTP libraries, the gen_server, gen_fsm, and gen_event, each of which is implemented as a behavior in Erlang parlance.

gen_server (Generic Server)

The most generic of the three is the gen_server behavior. This behavior implements a typical Erlang event loop and provides callbacks for a generic server. However, it doesn’t inherently implement any sort of networking server (as its name may imply) beyond what any Erlang process provides. It’s really a host for fairly generic business/communications logic. You may, however, use this to create things like our TcpEndoint from the Owsla project, where the generic gen_server happens to “own” a TCP connection and receives and reacts to events from that connection. Rather than repeat myself, I’ll refer you to the 2nd part of my series on Elixir and the Internet of Things, where I break down our TcpEndpoint worker in great detail.

gen_fsm (the Finite State Machine)

Does anyone remember what Finite State Machines (FSMs) are? You know, those little circle->arrow->circle diagrams that turn into regular expressions, or something like that? At least, that’s what I remember from my CS class in college. Seriously though, FSMs are useful for situations where you have logic that maps to a series of states and transitions. A small example from the wikipedia article on Finite State Machines will give you the idea of what I’m talking about. This one models a simple coin-operated turnstyle:

Coin Operated Turnstyle FSM

These kinds of workers are often used to model real-world devices (like the above-mentioned turnstyle).

	-module(turnstyle).
	-behaviour(gen_fsm).

	-export([start_link/1]).
	-export([coin/0]).
	-export([push/0]).
	-export([init/1, locked/2, unlocked/2]).

	start_link() -&gt;
	    gen_fsm:start_link({local, turnstyle}, tunrstyle, {}, []).

	coin() -&gt;
	    gen_fsm:send_event(turnstyle, {}).

	init() -&gt;
	    {ok, locked, {[]}}.

	locked({coin}, {}) -&gt;
	       do_unlock(),
	       {next_state, unlocked, {[], }, 30000}.

	locked({push}, {}) -&gt;
	       {next_state, locked, {[], }, 30000}.
	       
	unlocked(timeout, {}) -&gt;
	    do_lock(),
	    {next_state, locked, State}.
	    
    unlocked({push}, {}) -&gt;
    	do_rotate(),
    	{next_state, locked, State}.

You may have noticed that this is straight-up Erlang code. This is because, as of Elixir 0.12.5, the GenFSM.Behavior module was deprecated, and it was removed in 0.13. The reason given is that there are easier and more idiomatic ways in Erlang to implmement a finite state machine, and that those solutions (like) plain_fsm probably provide a better basis for a Finite State Machine implementation in Elixir. Also, please pardon what I am sure are the many typos in my Erlang code (I compiled it in my head and it seemed to work fine).

gen_event (Event Handlers)

Event Handlers, as implemented by the gen_event behavior, work in partnership with an event manager to separate the responsibility of raising events and handling them in different ways. An event manager may have zero, one, or several event handlers registered, and each will respond to an event sent to its event manager. Let’s implement the console logger from the erlang docs in Elixir:

	defmodule ConsoleLogger do
		use GenEvent.Behaviour
		
		def init(_) do
			{ :ok, [] }
		end
		
		def handle_event(error, state) do
			IO.puts &quot;***ERROR*** #{error}&quot;
			{:ok, state}
		end
	end

Now, you can start up a generic “event manager”, register our handler with it, and send it events:

	# start up our event manager
	:gen_event.start({:local, :error_manager})
	# add our handler
	:gen_event.add_handler(:error_manager, ConsoleLogger, [])
	# Now, send an event
	:gen_event.notify(:error_manager, :error)

which would result in an IEX session something like this:

	iex(2)&gt; :gen_event.start({:local, :error_manager})
	{:ok, #PID&lt;0.59.0&gt;}
	iex(3)&gt; :gen_event.add_handler(:error_manager, ConsoleLogger, [])
	:ok
	iex(4)&gt; :gen_event.notify(:error_manager, :error)
	:ok
	***ERROR*** error
	iex(5)&gt;

Note that you can register additional event handlers, each of which will be notified whenever an event is sent to the event manager. This is great when you want to separate out different concerns (logging to the console, log files, database tables, and airbrake.io, for example) and simply register the event handler(s) you need at the time.

Conclusion

This only scratches the surface of OTP, but hopefully it will give you a starting point for understanding how some of the moving parts fit together. In a future post, I’ll go into more detail on building OTP applications, supervisor trees and strategies, why our system kept imploding every time we disconnected more than 5 TCP/IP connections in a short period of time, and how you can avoid having the same thing happen to you.

Categories: Elixir, Erlang

Elixir and the Internet of Things (Part 3) – Rabbits, Rabbits Everywhere.

March 6, 2014 1 comment

(Cross-posted from neo.com, somewhat belatedly)

In honor of Erlang Factory 2014 (where I’ll be soaking much Elixir and Erlang knowledge today and tomorrow), I’d like to present the 3rd and final installment in my series on using Elixir when dealing with the Internet of Things.

In part 1 of this series, I discussed the stampede of connections issue we faced when working with Nexia Home on their home automation system, and what (Ruby) solutions we tried before landing on Elixir. Additionally, I explained how we backed into implementing an acceptor pool in order to handle 1000 new SSL connections/second from home automation bridges and other devices.

In part 2, I discussed the implementation of our TcpEndpoint process, and how it routes incoming, line-based TCP/IP protocols to a RabbitMQ queue for processing by our back-end systems, but I left the implementation of our RabbitProducer and RabbitConsumer for later.

So it appears we’re at now now.

Let’s start by taking a look at the Rabbit Producer. We’re using the ExRabbit Elixir rabbit client (forked from). The producer is a single Erlang process that simply pushes data to RabbitMQ. So far, it seems that this is fast enough to handle many thousands of messages per second, so we haven’t done any additional tweaking of this code.

     defmodule Owsla.RabbitProducer do
      use GenServer.Behaviour

      defrecord RabbitProducerState, amqp: nil, channel: nil, config: nil

      def start_link(config) do
        :gen_server.start_link({ :local, :producer }, __MODULE__, config, [])
      end

      def init(config) do
        amqp = Exrabbit.Utils.connect(host: config.rabbit_host,
                                      port: config.rabbit_port,
                                      username: config.rabbit_user,
                                      password: config.rabbit_pass)

        channel = Exrabbit.Utils.channel amqp
        queue = Exrabbit.Utils.declare_queue(channel, config.from_device, auto_delete: false, durable: true)

        {:ok, RabbitProducerState.new(amqp: amqp, channel: channel, config: config) }
      end

      def send(connection_id, message, from_device_seq) do
        :gen_server.cast(:producer, {:send, connection_id, message, from_device_seq})
      end

      def handle_cast({:send, connection_id, message, from_device_seq}, state ) do
        props = [headers: [{"connection_id", :longstr, connection_id}, {state.config.from_device_seq, :long, from_device_seq}]]
        Exrabbit.Utils.publish(state.channel, "", state.config.from_device, message, props)
        {:noreply, state }
      end

      def terminate(reason, state) do
        IO.puts "Rabbit producer terminating. Reason was:"
        IO.inspect reason
        Exrabbit.Utils.channel_close(state.channel)
        Exrabbit.Utils.disconnect(state.amqp)
        :ok
      end
    end

In init, we create a rabbitmq channel, declare our queue, and return a new record that stores our amqp connection and channel, along with some configuration information. A call to the send helper function will send a message to the locally registered producer process with the relevant data. Note that the TcpEndpoint maintains an outbound and inbound message sequence, and the from_device_seq parameter to send is that sequence number.

The single handle_cast receives the send message, adds two message headers for the connection ID and the previously-mentioned sequence number, and then publishes the message to a queue named in our configuration object.

There’s nothing really magical going on here, and the RabbitConsumer isn’t much different. However, we are leveraging another part of ExRabbit for the RabbitConsumer, which is the ExRabbit.Subscriber module.

    defmodule Owsla.RabbitConsumer do
      use Exrabbit.Subscriber

      def start_link(config) do
        :gen_server.start_link(Owsla.RabbitConsumer,
                          [host: config.rabbit_host,
                           port: config.rabbit_port,
                           username: config.rabbit_user,
                           password: config.rabbit_pass,
                           queue: config.to_device],
                          [])
      end

      def handle_message(msg, state) do
        case parse_message_with_props(msg) do
          nil -> nil
          {tag, payload, props} ->
            conid = get_header(props.headers, "connection_id")
            to_device_seq = get_header(props.headers, "to_device_seq")
            Owsla.TcpEndpoint.send(conid, to_device_seq, payload)
            ack state[:channel], tag
        end
      end

      def get_header(nil, _key), do: nil
      def get_header(:undefined, _key), do: nil
      def get_header([], _key), do: nil
      def get_header(headers, key) do
        [head | rest] = headers
        case head do
          {^key, _, value} -> value
          {_, _, _} -> get_header(rest, key)
        end
      end
    end

So, we start up our RabbitConsumer with the same kinds of configuration data we used for the RabbitProducer in the start_link function. The ExRabbit.Subscriber module does most of the hard work of subscribing to the queue we provided in the start_link call, and all we have to worry about is reacting to the handle_message function, which is called for us whenever we receive a message. Unfortunately, the headers property of the message properties is a list of tuples in the form of {key, type, value}, and not something useful like a HashDict, so we have some utility methods to walk the list and find the headers we need (namely, the connection ID and the sequence number from the incoming message) and then we just call the TcpEndpoint.send function.

If you remember from part 1, whenever we create a new TcpEndpoint process we register it’s PID in the global registry, which allows us to easily find the connection regardless of what node in our cluster it’s on. Note that we start up 8 (configurable) RabbitConsumers, and do some message ordering magic in the TcpEndpoint to make sure we’re delivering messages to the devices in the correct order.

And that’s pretty much it, round-trip from device -> backend -> device. For those who have been following along, our supervisor tree ends up looking like this:

Owsla Supervisor Tree

Categories: Elixir, Performance