An IRC discussion that largely characterizes what Runnels is and how it will work:
(13:12:06) f3ew pokes lkanies
(13:12:15) lkanies: eh? don’t poke me
(13:12:43) laen: Man, I leave for 18 hours, and I miss a lot.
(13:12:43) f3ew: Oh, I just read the mail thread on the message bus stuff
(13:13:15) lkanies: laen: doesn’t seem like that much went on in the meantime…
(13:13:21) lkanies: f3ew: what do you think? interested?
(13:13:36) lkanies: have any opinions on what we should use?
(13:13:36) f3ew: As I see it, you have three issues: The message format, the message envelope and the message transport
(13:13:42) lkanies: ayup
(13:13:52) f3ew: I would love SMTP, but it isn’t bi-directional
(13:14:07) f3ew: a) Fix the message format
(13:14:13) f3ew: b) Fix the envelope
(13:14:23) f3ew: c) The transport can be anything
(13:14:24) icblenke: well, it can be, kinda. ETRN and all.
(13:14:35) f3ew: icblenke not always
(13:14:58) f3ew: My point is, I should be able to send a message over SMTP and get a response over usenet
(13:15:20) icblenke: a response over usenet. that would be.. interesting. ;)
(13:15:23) f3ew: I should be able to send out a message over a HTTP POST and have a config change delivered via Jabber
(13:15:53) icblenke: totally agree
(13:15:55) lkanies: yeah, that’s a good point
(13:16:44) f3ew: Keep in mind that the message envelope is not necessarily the transport envelope
(13:17:22) lkanies: yeah, i’ve been thinking about that, too—that you’d want the "standard" to include the ability to include multiple objects, which implies talking about the envelope
(13:18:04) f3ew: <transport><message envelope><message></message></message envelope></transport>
(13:18:58) f3ew: What the message envelope should be able to do is provide AAA enforcement
(13:19:36) lkanies: hmm
(13:19:44) lkanies: i haven’t thought much about that
(13:20:02) lkanies: at the least i’ve really only been worrying about proving authenticity, rather than restricting access
(13:20:19) lkanies: because the whole idea here is that you just dump your data into the stream and anyone who wants can pick it up
(13:20:43) f3ew: Well, authorisation could be policy dependent
(13:20:53) f3ew: We should not try to set policy
(13:20:58) lkanies: what would an auth model look like? would the message generating nodes need to know which nodes they were authorizing? or would they just be authorization domains or something?
(13:21:16) f3ew: Default policy could be to allow anyone who can authenticate access, but that need not always be the case
(13:21:43) f3ew: I am looking at node <
(13:22:06) f3ew: You already do something similar for puppet with the public key stuff
(13:27:17) f3ew: Am I making sense?
(13:27:19) spike is doing the same thing
(13:27:25) spike: with my router msg thingie I mean
(13:27:35) spike: don’t get what’s the point of the third A up there tho
(13:27:36) f3ew: spike yes
(13:27:43) f3ew: accounting
(13:27:44) spike: or rather, I don’t do/want to do accounting
(13:27:59) spike: yeah, knew the meaning, just don’t see the point of doing that in such scenario
(13:28:26) f3ew: even if it means just minimal logging of the sort "received message M with ID I over transport T: validated"
(13:28:40) lkanies: sorry, was on the phone
(13:28:44) f3ew: np
(13:28:52) icblenke: I really don’t like microformats.org, mostly because you’re using HTML tags to separate data elements in a way that doesn’t semantically describe what the data represents.
(13:29:02) lkanies: accounting? i thought it was access, authentication, and authorization
(13:29:03) spike: oh, well, I do that, but consider it "logging".
(13:29:15) f3ew: I am actually looking at this as a place where XML might be useful
(13:29:16) spike: lkanies: nope, Authentication, Authorization and Accounting
(13:29:23) spike: that’s how u expand AAA
(13:29:26) icblenke: RDF appeals to me, if we really must XML.
(13:29:31) lkanies: that’s not how the ldap guys talk about it
(13:29:36) f3ew: Who you are, Ehat you can do and What you did
(13:29:43) f3ew: What
(13:29:46) lkanies: or at least, that’s the way it was years ago
(13:29:49) lkanies: when i cared :/
(13:29:53) f3ew: lkanies networking term
(13:30:06) spike: icblenke: what’s wrong with xml?
(13:30:13) lkanies: then how do you classify ip-based access controls?
(13:30:25) f3ew: lkanies authorization
(13:30:34) spike: yep
(13:30:44) icblenke: spike: I’m more of a YAML mindset kinda guy. Even raw key/value would make me happy.
(13:30:45) lkanies: but authorization is after authentication
(13:31:01) lkanies: and ip based access control is before it
(13:31:19) spike: lkanies: without knowing who u’re talking to, how do you define what u wanna give it access to?
(13:31:41) lkanies: spike: because all network connections necessarily have an ip address, and many have host names
(13:31:51) lkanies: so the first line of control is whether those are allowed at all
(13:31:54) f3ew: lkanies not necessarily
(13:32:06) lkanies: f3ew: so you’re talking about access to the stream itself, not access to the items on the stream?
(13:32:14) lkanies: how would access be controlled?
(13:32:20) f3ew: lkanies that is just a rule that says that regardless of authentication, deny access
(13:32:38) lkanies: i mean, would you encrypt the messages so only certain keys could read them?
(13:32:55) f3ew: actually, I would just want to validate the messages
(13:33:02) icblenke: wouldn’t using the client/server SSL keys suffice? basic public key cryptography?
(13:33:03) lkanies: no, it’s different: IF the ip is allowed, THEN authenticate; IF authentication passes, THEN verify authorization
(13:33:11) spike: icblenke: oh, YAML, havent found the time to look into that yet. would you mind commenting pros of YAML over XML?
(13:33:19) f3ew: so that a random stranger on the web would not be able to send in an invalid message
(13:33:35) icblenke: spike: simplicity. primarily. human readable. easy to do in ruby.
(13:33:36) lkanies: ok; so you’re not so much thinking of restricting access to the logs
(13:33:38) f3ew: lkanies that is a question of order and optimising checking order
(13:33:47) lkanies: f3ew: not in any system i’ve seen
(13:33:55) lkanies: no firewall i’ve ever seen does authenticated ip blocking
(13:34:05) f3ew: lkanies It doesn’t matter
(13:34:12) lkanies: it just says, "oh, you’re sending traffic from X? Bzzzzt"
(13:34:22) lkanies: i agree, i guess
(13:34:34) f3ew: Your policy is: regardless of authentication status, disregard any connection from <ip block>
(13:34:51) lkanies: let’s just drop it
(13:34:52) spike: lkanies: in firewall realm, IP is a mean of authentication, that’s how you read it imho
(13:34:53) f3ew: So we don’t bother about authentication, since we don’t care about the result
(13:34:56) spike: ehehe, k
(13:35:12) spike drops it on f3ew’s feet
(13:35:30) f3ew drops spike into /dev/null
(13:35:51) spike equips himself for a loooooong trip
(13:35:56) icblenke: I’m partial to an encryption scheme per message to solve this problem.
(13:36:16) lkanies: icblenke: that suddenly seems not as simple
(13:36:29) icblenke: it also imposes some overhead.
(13:36:40) f3ew: icblenke If we have keys in place, encryption can be added later. Without having that in the design, modifications will be harder
(13:36:45) lkanies: p2p encryption is easy; stream-based encryption, especially when you literally don’t even know who you’re allowing access to, is less so
(13:36:51) lkanies: yeah
(13:36:58) lkanies: let’s start with keys, and signing, and leave it there for now
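
Note: the "keys and signing" starting point agreed on above could look roughly like the following minimal Python sketch. It is an illustration only. The discussion leans toward public keys, but this sketch swaps in an HMAC with a shared secret because that needs nothing beyond the standard library. The field names, the key, and the `canonical`, `sign` and `verify` helpers are all hypothetical.

```python
import hashlib
import hmac


def canonical(fields):
    # Sort keys so signer and verifier serialize the fields identically.
    return "\n".join(f"{k}={fields[k]}" for k in sorted(fields)).encode("utf-8")


def sign(fields, key):
    # Return a copy of the message with a hex signature field added.
    digest = hmac.new(key, canonical(fields), hashlib.sha256).hexdigest()
    return {**fields, "signature": digest}


def verify(message, key):
    # Recompute the signature over everything except the signature itself.
    fields = {k: v for k, v in message.items() if k != "signature"}
    expected = hmac.new(key, canonical(fields), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, message.get("signature", ""))


KEY = b"hypothetical-shared-key"  # assumption: stand-in for real key material
signed = sign({"type": "log", "source": "hostA", "body": "disk full"}, KEY)
tampered = {**signed, "body": "all fine"}
print(verify(signed, KEY), verify(tampered, KEY))
```

Signing proves the message was not altered in transit; it does not hide the contents, which matches the decision to leave encryption for later.
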
(13:37:05) spike: damn, gtg. it would be very useful if this discussion was moved to ml… so I could read it
(13:37:06) f3ew: Once we figure out the issues, life can get easier
(13:37:11) f3ew: Is anyone logging this?
(13:37:15) spike: or get logged somewhere… I’m definitely interested
(13:37:20) f3ew: heh
(13:37:23) spike: :)
(13:37:24) icblenke: once you have a symmetric key, the overhead isn’t nearly as bad. using asymmetric crypto per message would be mucho overhead.
(13:37:41) f3ew: icblenke lets get something in there right now
(13:37:45) icblenke: logging on.
(13:37:59) f3ew: and copy what happened earlier into a file please
(13:38:11) lkanies: i can post all of the logs from yesterday and today
(13:38:11) spike: ok, c you tomorrow guys, have a nice ‘eve whatever is there.
(13:38:14) lkanies: spike: laters
(13:38:43) spike left the room (quit: "gone").
(13:41:40) lkanies: ok
(13:41:59) lkanies: so let’s start with format
(13:42:08) lkanies: i think that, frankly, the format doesn’t really matter that much
(13:42:18) lkanies: we can change our minds 2 months in, because it’s all pretty simple
(13:42:35) lkanies: i’ll just have to_picoformat methods or whatever, and i can just write new methods pretty easily
(13:42:46) lkanies: i’m sure we all agree it will be text and will be human readable
(13:42:51) f3ew: yes
(13:42:54) lkanies: beyond that, the details just don’t matter that much
(13:43:01) f3ew: and easily machine parsable
(13:43:02) lkanies: and i’m sure we all agree it will basically be key/value pairs
(13:43:04) lkanies: yeah
(13:43:56) f3ew: I like the idea of key/value pairs, but those might not suffice if things start to get complicated
(13:44:05) lkanies: so, the questions i have are, will the messages support including other messages? what metakeys will we have (e.g., format type, format version, signature, source, destination, etc.)?
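
Note: a human-readable, easily parsed key/value format with a few metakeys might be sketched as below. This is only one possible reading of the discussion. The `key: value` line syntax and the metakey names (`format`, `version`, `type`, `source`) are assumptions, and the sketch does not handle multi-line values or nested messages.

```python
def dump(fields):
    # One "key: value" pair per line, in insertion order.
    return "".join(f"{key}: {value}\n" for key, value in fields.items())


def parse(text):
    # Split each non-empty line on the first colon only, so values may contain colons.
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


message = {
    "format": "runnel-sketch",  # hypothetical metakey values
    "version": "1",
    "type": "log",
    "source": "hostA",
    "body": "puppet: run finished",
}
text = dump(message)
print(text, end="")
print(parse(text) == message)
```
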
(13:44:10) lkanies: i think they’ll be fine
(13:44:24) lkanies: they work fine for ldap and i’ve done some ridiculous things there
(13:44:41) lkanies: anyone else used ldap much? specifically, the inheritance and classing stuff?
(13:44:46) icblenke: copied the scrollback. whew
(13:45:36) icblenke: unfortunately, not really.
(13:45:38) lkanies: icblenke: i just saved the whole thing in html
(13:45:56) lkanies: i’ve got scrollback from, um, a long time ago, so i figured i’d just post the whole thing when we’re done
(13:46:00) f3ew: me
(13:46:11) lkanies: ok, i’ll give a quick breakdown
(13:46:27) lkanies: basically "objectclasses" in ldap are just a named collection of mandatory and optional fields
(13:46:52) lkanies: fields have certain metadata associated: mostly you care about whether they can be single or multivalue, but there are a few other things
(13:47:14) lkanies: you can subclass a given objectclass, which just means that you want to add fields to an existing class
(13:47:22) lkanies: subclasses cannot remove or modify existing fields
(13:47:29) lkanies: it’s stupid-easy
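
Note: the objectclass idea described above (a named set of mandatory and optional fields, where a subclass may only add fields) could be sketched in Python as follows. The class and field names are hypothetical, and real LDAP schemas carry more metadata (for example single versus multi-value) than this sketch models.

```python
class ObjectClass:
    def __init__(self, name, must=(), may=(), parent=None):
        self.name = name
        self.parent = parent
        inherited_must = parent.must if parent else frozenset()
        inherited_may = parent.may if parent else frozenset()
        # Subclasses may only add fields, never redefine inherited ones.
        clash = (frozenset(must) | frozenset(may)) & (inherited_must | inherited_may)
        if clash:
            raise ValueError(f"{name} redefines inherited fields: {sorted(clash)}")
        self.must = inherited_must | frozenset(must)
        self.may = inherited_may | frozenset(may)

    def validate(self, fields):
        # Return (missing mandatory fields, fields the class does not know).
        keys = set(fields)
        missing = self.must - keys
        unknown = keys - self.must - self.may
        return sorted(missing), sorted(unknown)


log = ObjectClass("log", must=["type", "source", "body"], may=["severity"])
puppet_log = ObjectClass("puppetlog", must=["resource"], may=["run_id"], parent=log)

print(puppet_log.validate({"type": "log", "body": "x", "color": "red"}))
```
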
(13:48:10) lkanies: this is one big difference, i think, between what we want and the microformats guys: they are prioritizing on world-wide exchange, and we only need to prioritize on per-site exchange
(13:48:24) f3ew: per node exchange
(13:48:31) lkanies: and i can nearly guarantee that people are going to want to create custom classes that aren’t much use outside their own sites
(13:48:43) lkanies: what i mean is, people will very rarely be exchanging world-wide log or metric data
(13:48:59) f3ew: Actually, I would like to see per site software config management classes
(13:49:02) lkanies: and the ones that do happen to get exchanged world-wide (like ganglia is doing) will naturally coalesce into a standard
(13:49:07) f3ew: but we are so not going the OID route
(13:49:11) lkanies: no
(13:49:14) icblenke: Is an endpoint globally unique? I like the idea of using GUIDs.
(13:49:14) lkanies: no stinking oids
(13:49:23) lkanies: what do you mean?
(13:49:33) f3ew: icblenke not necessarily
(13:50:01) f3ew: Globally Unique IDs, 128 bit hashes of mac address, ip address, and a few other host parameters
(13:50:18) icblenke: also, timestamp.
(13:50:52) f3ew: yes
(13:50:57) lkanies: yeah, global message ids are a good idea
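
Note: a 128-bit message ID built by hashing host parameters and a timestamp, as suggested above, might look like this Python sketch. The exact inputs are assumptions: it uses the MAC address reported by `uuid.getnode()`, the hostname (rather than an IP address, to avoid a lookup), the time, and a per-sender sequence number, then keeps the first 128 bits of a SHA-256 digest.

```python
import hashlib
import socket
import time
import uuid


def message_id(sequence, now=None):
    # Combine host parameters, a timestamp and a per-sender counter.
    now = time.time() if now is None else now
    parts = [f"{uuid.getnode():012x}", socket.gethostname(), repr(now), str(sequence)]
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).digest()
    # Keep 16 bytes (128 bits), rendered as 32 hex characters.
    return digest[:16].hex()


print(message_id(1))
print(message_id(2))
```

The sequence number is there so that two messages created within the same clock tick on one host still get different IDs.
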
(13:51:15) lkanies: anyone have a wiki up already?
(13:51:34) icblenke: it makes routing things a bit easier, in that there should be no collisions.
(13:51:39) lkanies: i can start writing this up, but a wiki seems like a better idea
(13:52:10) icblenke: yep. a wiki would be ideal.
(13:52:39) icblenke: be back soon. need to jet for a bit.
(13:52:45) lkanies: hmm, ok
(13:55:50) lkanies: i think i’m going to take this opportunity to walk to a coffee shop, then
(13:55:56) lkanies: and i can continue the conversation from there
(13:57:08) f3ew will go to bed
(13:57:17) f3ew has been busy at conference
(13:57:31) f3ew: Oh, and can someone please record the Lisa stuff?
(13:57:47) lkanies: i’ll do my best to get a record
(13:57:59) lkanies: but since you and icblenke seem to be the best contributors and neither will be there..
(13:58:01) f3ew: ty
(13:58:03) lkanies: ok
(13:58:32) f3ew: video would be nice, particularly with good audio and whiteboards captured
(14:08:01) icblenke: ok, I just noodled this a bit on the drive over.
(14:08:22) icblenke: everything we’re talking about so far leads me right back to Jabber/XMPP
(14:09:02) f3ew: icblenke those are transports
(14:09:08) icblenke: we need authentication and encryption. jabber has authentication and SSL/TLS for encryption.
(14:09:15) icblenke: XMPP has publish/subscribe.
(14:09:18) f3ew: yes, but that is transport
(14:09:36) icblenke: ah. I see what you’re saying. I agree then.
(14:10:01) icblenke: what we need to agree on, then, is that our message bus relies on the transport for end-to-end security and authentication.
(14:10:09) icblenke: keep the message bus simple.
(14:10:16) f3ew: for end to end security, yes
(14:10:34) icblenke: the message bus needs some kind of "command channel" to "subscribe" or "publish" to some endpoint.
(14:10:34) f3ew: authenticating the message contents itself, not necessarily
(14:10:55) f3ew: the message bus config is handled via puppet :)
(14:11:08) f3ew: We can do it inline
(14:11:26) icblenke: authenticating the message contents itself would be handled best by something like SDSC secure syslog’s "syslog-reliable" (RFC 3195) stuff.
(14:11:48) icblenke: http://security.sdsc.edu/software/sdsc-syslog/
(14:12:04) f3ew: icblenke that was what the whole signature thing above was about
(14:12:15) icblenke: missed the signature bit somehow.
(14:12:26) f3ew: you wanted encryption in there
(14:12:43) icblenke: understanding the problem, that seems like a transport issue now.
(14:12:51) f3ew: right :)
(14:13:06) f3ew: What I want to know is that the message has not been MITM’ed
(14:13:12) f3ew: or corrupted
(14:13:19) icblenke: wouldn’t the transport handle that?
(14:13:48) f3ew: multi-hop routers?
(14:13:59) f3ew: Don’t assume a single router
(14:14:14) f3ew: but transport is only per pair of endpoints
(14:14:40) icblenke: a transport should reliably deliver a message in its entirety, in sequential order, no?
(14:15:12) f3ew: but it does that to the next hop
(14:15:15) icblenke: each leg of the route would need that guarantee.
(14:15:23) f3ew: yes
(14:15:54) f3ew: and having the last thing check a signature should have a second test built in to ensure that things work
(14:16:15) f3ew: You know, we are reinventing the internet here :)
(14:16:42) laktop [n=luke@c-69-180-218-163.hsd1.tn.comcast.net] entered the room.
(14:17:00) icblenke: if the transport is reliable, why check at all? seems like unnecessary overhead.
(14:17:25) f3ew: because the intermediate hosts may not be
(14:18:05) icblenke: let’s talk about segmentation and reassembly: where would the resend requests be initiated from? the endpoint queue runner?
(14:18:25) icblenke: so we’re looking for publish, subscribe, and resend messages in the control channel now.
(14:18:45) laktop: f3ew: i thought you were going to bed
(14:18:56) f3ew: node A -> router A, validate and possible reassembly -> router B -> router C -> node B (validate)
(14:19:01) laktop: and how the heck do i do tab completion in x-chat?
(14:19:08) f3ew: f<tab>
(14:19:22) f3ew: type the first character
(14:19:22) laktop: nope, it just selects everything i’ve typed
(14:19:32) laktop: oh well, must be an xchat-on-osx problem
(14:19:45) icblenke: segmentation/reassembly of a message would be a transport thing. I’m thinking streams to messages.
(14:20:01) laktop: why would you even need to worry about segmentation?
(14:20:09) laktop: i obviously missed a bit, sorry
(14:20:25) icblenke: that would be an agent thing then. no worries.
(14:21:19) laktop: can someone explain to me what i missed in the 7 minute walk here?
(14:21:40) icblenke: a message gets sent, makes it to a router, and the router crashes taking the message with it. when it comes back up, the sequence is interrupted, and the sending end times out waiting for the next message in the sequence.. it needs to ask the sending agent for the message again.
(14:21:53) f3ew: laktop got that?
(14:21:58) laktop: ok
(14:22:22) laktop: it already feels like this is getting too complicated
(14:22:29) icblenke: we’re really talking about a "stream" at this point. ala TCP.
(14:22:34) laktop: but that does seem like an important thing
(14:22:45) icblenke: reliable sequential delivery of messages.
(14:22:52) laktop: OTOH, that means that sequence numbers have to map directly to subscriptions
(14:23:20) laktop: that is, if host A subscribes to all log messages from puppet, then there needs to be a clear way for host B to sequence its puppet log messages such that host A always knows if it got everything
(14:23:51) icblenke: if host B is missing something, it needs a way to ask host A to resend it.
(14:23:53) laktop: and i can’t always get more specific: only subscribe to puppet error messages from the DNS class
(14:24:07) f3ew: Uh, no
(14:24:08) icblenke: this also means that host A needs to keep some buffer of messages around to be resent.
(14:24:10) laktop: but generally, sequence numbers will be the only way for it to know it missed something
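
Note: using sequence numbers so a consumer can tell that it missed something could be sketched as below. This is a hedged illustration. It assumes each producer numbers its messages per stream (for example per subscription), and the `GapDetector` name and the stream labels are hypothetical.

```python
class GapDetector:
    def __init__(self):
        # (source, stream) -> next sequence number we expect to see
        self.expected = {}

    def receive(self, source, stream, seq):
        # Return the sequence numbers skipped before this message, if any.
        key = (source, stream)
        expected = self.expected.get(key, seq)
        missing = list(range(expected, seq))
        self.expected[key] = max(expected, seq + 1)
        return missing


detector = GapDetector()
for seq in (1, 2, 5):
    gaps = detector.receive("hostB", "puppet-log", seq)
    if gaps:
        print("ask hostB to resend", gaps)
```

The returned list is what the consumer would put in a resend request, which is why the producer has to keep a buffer of recent messages around.
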
(14:24:29) f3ew: oh, sorry, I misread
(14:24:31) laktop: f3: what do you mean, no
(14:24:58) f3ew: I read that as hosts A and B sending messages to the router and needing to sync
(14:25:10) laktop: ah; ok
(14:25:20) icblenke: is a router aware of subscriptions?
(14:25:24) f3ew: icblenke I want to be able to ignore transport issues right now
(14:25:38) icblenke: but it’s not a transport issue, really.
(14:25:42) f3ew: Well, a router pushes stuff somewhere else
(14:25:57) icblenke: a transport issue would be between each leg of the route through various routers
(14:25:59) f3ew: the why is not important
(14:26:02) laktop: i think that, by far, the default case is that messages will be independent of each other other than time
(14:26:09) f3ew: yes
(14:26:14) laktop: i would say that the router is exactly where you would have your subscriptions
(14:26:35) laktop: so let’s focus on the why, and if we simply guarantee message delivery in order by time, then we’ve got that
(14:26:45) laktop: rather, we’ve got the majority of cases
(14:27:04) icblenke: we dont want to miss messages though. is transport enough to guarantee that end to end?
(14:27:16) f3ew: laktop think of it like this: The primary job of the router is to push messages. It does this by looking at a routing table. We can have local config entries (static routes/push) and externally learnt routes (subscriptions)
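
Note: the routing table described above, with locally configured static routes plus routes learnt from subscriptions, could be sketched in Python as follows. This is a minimal illustration keyed only on the message type. The `Router` class, its method names and the host names are all hypothetical.

```python
from collections import defaultdict


class Router:
    def __init__(self):
        self.static_routes = defaultdict(set)   # local config: type -> destinations
        self.subscriptions = defaultdict(set)   # learnt at runtime: type -> consumers

    def add_static_route(self, msg_type, destination):
        self.static_routes[msg_type].add(destination)

    def subscribe(self, msg_type, consumer):
        self.subscriptions[msg_type].add(consumer)

    def unsubscribe(self, msg_type, consumer):
        self.subscriptions[msg_type].discard(consumer)

    def next_hops(self, message):
        # The router only decides where to push; it does not care why.
        msg_type = message.get("type")
        static = self.static_routes.get(msg_type, set())
        learnt = self.subscriptions.get(msg_type, set())
        return sorted(static | learnt)


router = Router()
router.add_static_route("log", "archive")
router.subscribe("log", "hostA")
router.subscribe("metric", "stats")
print(router.next_hops({"type": "log", "body": "disk full"}))
```
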
(14:27:29) laktop: well, we’ll be using tcp end to end, and then each daemon is responsible for sending out each message it receives
(14:27:35) icblenke: is it ok for a router to "go down" and take messages with it?
(14:28:06) f3ew: icblenke that is why we queue
(14:28:19) laktop: f3ew: yeah, that makes sense, so the router needs subscription methods and static route configs
(14:28:28) laktop: icblenke: yeah, i think that’s basically a solved problem
(14:28:55) laktop: store queued messages on disk, and don’t delete them from the queue until you’ve confirmed they’re correctly sent
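
Note: storing queued messages on disk and deleting them only once delivery is confirmed could be sketched as below. This is an assumption-heavy illustration: one JSON file per message, a list of destinations still waiting for confirmation, and message IDs that are safe to use as file names. The `DiskQueue` class is hypothetical.

```python
import json
import os
import tempfile


class DiskQueue:
    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, message_id):
        return os.path.join(self.directory, f"{message_id}.json")

    def _write(self, message_id, record):
        # Write to a temporary file first so a crash never leaves a half-written record.
        tmp = self._path(message_id) + ".tmp"
        with open(tmp, "w", encoding="utf-8") as handle:
            json.dump(record, handle)
        os.replace(tmp, self._path(message_id))

    def enqueue(self, message_id, message, destinations):
        self._write(message_id, {"message": message, "pending": sorted(destinations)})

    def ack(self, message_id, destination):
        # Drop one destination; remove the file once every destination has confirmed.
        with open(self._path(message_id), encoding="utf-8") as handle:
            record = json.load(handle)
        record["pending"] = [d for d in record["pending"] if d != destination]
        if record["pending"]:
            self._write(message_id, record)
        else:
            os.remove(self._path(message_id))

    def pending(self):
        names = os.listdir(self.directory)
        return sorted(name[:-5] for name in names if name.endswith(".json"))


with tempfile.TemporaryDirectory() as directory:
    queue = DiskQueue(directory)
    queue.enqueue("m1", {"type": "log", "body": "disk full"}, ["hostA", "hostB"])
    queue.ack("m1", "hostA")
    print(queue.pending())
    queue.ack("m1", "hostB")
    print(queue.pending())
```

Tracking pending destinations per message is one way to handle the case where "sent" means delivered to many hosts.
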
(14:28:59) icblenke: so an agent on host A sends out a "publish" message to the router for some "resource namespace"
(14:29:05) laktop: of course, "sent" is weird here, because it could be to tons of hosts
(14:29:16) f3ew: laktop but that management has nothing to do with the "router" part being aware of transports. It doesn’t care about why, only how
(14:29:17) icblenke: and an agent on host B sends out a "subscribe" message to the router for that resource name
(14:29:27) f3ew: laktop SMTP
(14:29:43) f3ew: We have a response code which indicates message accepted
(14:29:51) f3ew: icblenke BGP
(14:29:55) laktop: f3ew: i didn’t mean to imply it wasn’t solved, just that it’s a slightly more complicated version
(14:30:04) f3ew repeats his assertion about reinventing the Internet
(14:30:18) laktop: yeah, i share the concern
(14:30:30) f3ew: We need to back down
(14:30:35) laktop: that’s why i’m much more interested in something that works a little right away than something huge
(14:30:50) f3ew: What do we want/need?
(14:31:00) f3ew: want: AAA
(14:31:07) f3ew: need: authentication
(14:31:12) laktop: ah, ok
(14:31:28) icblenke: I need to transport events: syslog, netflow, stats, etc, reliably, from collectors to a central aggregator.
(14:31:47) laktop: i need to subscribe to events from other nodes
(14:31:57) laktop: and i need those events to be relatively reliably sent to me
(14:32:14) laktop: 99.9% or whatever; i don’t mind missing a few here and there
(14:32:38) laktop: i need to subscribe to event types
(14:33:02) icblenke: I’d like nagios alerts as well. Take all of those alert event sources, and run them through something like SEC to alert me when something goes wrong. Take all of the stats messages and process them on a stats server for display.
(14:33:07) laktop: i want to subscribe to objects based on field values and substrings, e.g., "field X =~ /dns/i"
(14:33:24) laktop: icblenke: can you restate that?
(14:33:43) icblenke: as admins we have a number of basic problems.
(14:33:48) icblenke: monitoring and alerting.
(14:33:59) f3ew: icblenke nagios would be in the etc
(14:33:59) icblenke: statistics and trending.
(14:34:05) laktop: logging
(14:34:06) f3ew: security
(14:34:06) icblenke: right.
(14:34:34) icblenke: I’d like to be able to have a nagios alert event trigger a ticket getting opened in RT
(14:34:42) icblenke: well.. more specifically..
(14:34:43) laktop: ok
(14:35:00) icblenke: I’d like to be able to have SEC process a Nagios alert, and then have it send a message to have a ticket opened in RT.
(14:35:02) laktop: that’s easy: you write a custom node that subscribes to nagios alerts and knows how to open RT tickets
(14:35:07) f3ew: Ok, what we want is an interface to accept events, where an event is defined as a message from an external service
(14:35:13) f3ew: laktop hang on
(14:35:17) laktop: ...?
(14:35:22) f3ew: Lets define the problem scope first
(14:35:29) f3ew: then we look at solutions
(14:35:46) laktop: ok
(14:35:52) icblenke: I’d also like to transport syslog, netflow, SAR, etc data over a bus to an aggregator that can provide a central location for stats viewing.
(14:36:06) f3ew: Once the event is accepted, we convert it to a standard format.
(14:36:24) laktop: icblenke: do you want to statically route them there, or do you want the aggregator to subscribe to them?
(14:36:27) f3ew: This is then sent to a number (one or more) of actors
(14:36:44) icblenke: I would also like to start pushing configuration events to servers from a central repository.
(14:36:46) f3ew: Each actor can do different actions
(14:36:51) f3ew: icblenke please!
(14:36:51) laktop: f3ew: i’m not convinced that we should accept events in anything other than our standard format
(14:37:06) f3ew: laktop at the router, yes
(14:37:09) laktop: hmm
(14:37:18) f3ew: but you still want a conversion utility somewhere
(14:37:38) icblenke: the agents that publish some resource should have a format that subscribers are aware of.
(14:37:45) laktop: ok; we talk about how it would be done, and maybe have some example implementations, but don’t consider it to be part of the main runnels core
(14:38:13) f3ew: event generator
(14:38:20) f3ew: right
(14:38:23) laktop: yeah
(14:38:36) f3ew: acceptor and convertor are application/input format specific
(14:38:38) laktop: and event generators obviously could just convert and send them directly
(14:38:49) f3ew: right
(14:38:51) laktop: which, i expect, is the ultimate goal
(14:38:56) f3ew: Not out headache
(14:38:59) f3ew: our
(14:39:02) laktop: and certainly is what puppet will do
(14:39:03) laktop: yeah
(14:39:31) f3ew: once it gets to the router, we need to be able to figure out where the message needs to go, and send it to all of them
(14:39:32) laktop: one would expect that part of the overall package will include a daemon that does tons of different conversions in either direction
(14:39:36) laktop: yeah
(14:39:41) f3ew: this needs to get acks from each node
(14:39:55) icblenke: eventually.
(14:39:58) laktop: does it need acks? for every message?
(14:39:59) f3ew: laktop or just a bunch of scripts
(14:40:28) f3ew: an ack can be just a simple numeric code
(14:40:30) icblenke: I thought the transport was going to handle that?
(14:40:33) laktop: f3ew: i’m fine not talking about implementations, but i don’t think it’ll be too hard to have a module interface that, say, makes it easy to inter-convert any messages of the same type
(14:41:15) laktop: i actually already know how i’d do it in ruby
(14:41:16) f3ew: laktop yes. but if you have a structured format, you would need heavy duty scripting to convert syslog to that
(14:41:47) f3ew: Also, the convertor signs the message before sending it to the router
(14:42:09) laktop: okay, so our messages need to be signed, or we prefer they’re signed?
(14:42:14) icblenke: why not make a "convertor" just another agent?
(14:42:21) laktop: yeah, that’s what i’m thinking
(14:42:35) icblenke: an agent that knows how to subscribe to one format and output another.
(14:42:43) f3ew: icblenke yes
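
Note: a converter that accepts one format and emits key/value messages could be sketched like this for classic BSD-style syslog lines. It is only an illustration. The regular expression covers the common `timestamp host program[pid]: text` layout and nothing else, and the output field names are assumptions.

```python
import re

# Matches lines such as "Nov  3 14:41:16 hostA sshd[1234]: Failed password for root".
SYSLOG_LINE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"(?P<program>[^\[:\s]+)(?:\[(?P<pid>\d+)\])?:\s(?P<body>.*)$"
)


def convert(line):
    # Return a key/value message, or None when the line is not recognised.
    match = SYSLOG_LINE.match(line)
    if match is None:
        return None
    fields = {k: v for k, v in match.groupdict().items() if v is not None}
    fields["type"] = "syslog"
    return fields


print(convert("Nov  3 14:41:16 hostA sshd[1234]: Failed password for root"))
print(convert("not a syslog line"))
```
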
(14:43:06) f3ew: but for our immediate convenience, I would like to describe them slightly differently
(14:43:21) laktop: ok
(14:43:31) f3ew: a convertor and acceptor would generally be implemented as a single component
(14:43:55) icblenke: with a producer and a consumer aspect to it.
(14:44:04) laktop: let’s just ignore that for now
(14:44:19) f3ew: right
(14:44:27) f3ew: now, do we need acks?
(14:44:30) laktop: we know there will be some kind of accepter/converter, and it will almost definitely work both ways
(14:44:36) laktop: i say the acks are part of the protocol
(14:44:41) f3ew: or do we rely on the transport layer to tell us if that worked
(14:44:44) laktop: not part of the messages
(14:44:49) f3ew: yes
(14:44:56) f3ew: right, so it gets shoved out
(14:45:17) laktop: i don’t know if it gets pushed down into the transport
(14:45:24) f3ew: Ok, now, what happens if we get duplicate messages?
(14:45:25) icblenke: quandary: would more than one host publish the same resource name?
(14:45:36) f3ew: no
(14:45:42) laktop: what? hell yeah they would
(14:45:48) f3ew: resource name would include hostnames and ips
(14:45:53) laktop: well, depending on how you define resource
(14:45:55) icblenke: ie, would more than one host publish a "syslog" resource?
(14:46:01) laktop: definitely
(14:46:08) f3ew: Yes
(14:46:16) laktop: well wait; what is this "resource" we’re talking about?
(14:46:18) icblenke: would a consumer then get messages from multiple "syslog" producers?
(14:46:27) laktop: we have producers, consumers, and messages; what’s a resource?
(14:46:35) laktop: icblenke: definitely
(14:46:42) icblenke: a resource is the thing published or subscribed to.
(14:46:42) f3ew: resource syslog { name = $hostname-$service; router = $router; }
(14:47:03) f3ew: an event
(14:47:05) laktop: wouldn’t that be a resource type? "I want all things related to syslog"
(14:47:08) icblenke: a type of event.
(14:47:12) f3ew: or rather, the type of event
(14:47:18) laktop: then say "type", not resource
(14:47:27) f3ew: yup
(14:47:29) icblenke: a specific event would be an accourance of a message published to a resource.
(14:47:34) laktop hates terminology confusion
(14:47:35) icblenke: occourance.
(14:47:40) icblenke: whatever.
(14:47:53) laktop: and i think we should stick to "message", not "event", because "event" will be one of our message types
(14:48:01) icblenke: a specific event is then a message published to a type?
(14:48:14) laktop: i don’t have any idea what you’re talking about :/
(14:48:22) laktop: i don’t know what "published to a type" means
(14:48:25) icblenke: what is a "type"?
(14:48:41) icblenke: I think of those two things as wholly separate.
(14:48:43) laktop: i create a message; it has a "type" field; that type field can be things like "log", "metric", "alert"
(14:48:57) laktop: i drop that message on the runnel bus
(14:49:07) laktop: it gets routed to people who are subscribed to it
(14:49:12) laktop: s/people/consumers/
(14:49:20) icblenke: so you’re publishing a "metric", which will be consumed by anyone subscribing to a "metric"?
(14:49:46) laktop: yes, except i would describe the subscription as specifying both the field and the value
(14:49:53) laktop: so the subscription is "type == metric"
(14:49:57) icblenke: so a message has a source and a type, no destination.
(14:50:26) icblenke: I see what you’re saying.
(14:50:34) laktop: that would be my preference
(14:50:46) icblenke: make the subscription a logical construct that can specify various qualities of a message.
(14:50:48) laktop: i think that destinations could be provided through things like static routes in the router
(14:50:52) laktop: exactly
(14:51:06) f3ew: laktop you need one destination: the forst router
(14:51:08) icblenke: that’d work.
(14:51:08) f3ew: first
(14:51:18) laktop: and if you want to allow complicated things like pattern matching, that’s great, but we don’t need it
(14:51:30) icblenke: does the message really need to point to a router? that sounds like a transport thing.
(14:51:38) laktop: f3ew: well, you need someone to dump your message to; i don’t know if that’s the same thing as a destination
(14:51:41) laktop: yeah
(14:51:44) f3ew: f3ew Ok, what we want is an interface to accept events, where an event is defined as a message from an external service
(14:51:51) laktop: what we need is something like an MX record
(14:51:59) f3ew: Uh, no
(14:51:59) laktop: some way of determining who we send all messages to
(14:52:14) icblenke: sounds like a configuration detail.
(14:52:18) f3ew: just use SRV records in DNS
(14:52:22) f3ew: yup
(14:52:25) icblenke: MX records are configuration details moved to a global namespace.
(14:52:27) laktop: i’m not saying we literally need a dns record
(14:52:47) f3ew: implementation, not design there
(14:52:58) laktop: i’m just saying that the message producers need some way to figure out who to send messages to, and they need to send all of their messages there, and that’s all they need
(14:53:10) f3ew: right
(14:53:13) icblenke: IN SRV PUPPET ROUTER madhatter
(14:53:16) f3ew: a default route, if you will
(14:53:23) laktop: no, because it obviates the need for a destination
(14:53:25) laktop: exactly
(14:53:29) icblenke: heh.
(14:53:30) laktop: a default runnel :)
(14:53:43) f3ew: icblenke routers, not SMTP servers ;)
(14:54:31) icblenke: message producers need some way to identify the content they are sending uniquely enough for a consumer to identify that it wants to get that message.
(14:54:53) laktop: icblenke: i think that’s inline to the message
(14:55:04) icblenke: to that end, if you’re using an expressive subscribe syntax, you could key off of anything in the message.
(14:55:09) f3ew: icblenke message ids
(14:55:12) laktop: that is, you wouldn’t say, "this message is of type log" when it has a "type = syslog" field
(14:55:29) laktop: like we said, messages are just key/value pairs
(14:55:36) laktop: and one of the key/value pairs will be the message type
(14:55:38) icblenke: but if I wanted "type == syslog && version = 1"
(14:55:43) f3ew: icblenke a consumer indicates a wish to get messages of a certain type
(14:55:44) icblenke: er, ==1.
(14:55:47) laktop: and if we were to follow ldap then we could have multiple types
(14:55:51) f3ew: then it gets them all
(14:55:52) laktop: icblenke: that would be fine
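A minimal sketch of the idea discussed above, assuming a message is represented as a plain dictionary of key/value pairs with the type carried inline as an ordinary field. The `matches` helper and the `host` and `text` fields are hypothetical names used only for illustration; the log does not define a concrete format.

```python
# Hypothetical sketch: a runnel message is just a set of key/value pairs,
# and a subscription is a set of field values that must all match.

def matches(message, wanted):
    """Return True if every wanted key is present in message with an equal value."""
    return all(message.get(key) == value for key, value in wanted.items())

# The message type is carried inline as an ordinary field (assumed layout).
syslog_v1 = {"type": "syslog", "version": "1", "host": "madhatter", "text": "disk 2% full"}
syslog_v2 = {"type": "syslog", "version": "2", "host": "madhatter", "text": "disk 2% full"}

# Equivalent of "type == syslog && version == 1".
subscription = {"type": "syslog", "version": "1"}

print(matches(syslog_v1, subscription))  # True
print(matches(syslog_v2, subscription))  # False
```

An agent written for version 1 events would subscribe with both fields, so version 2 events never reach it until it is upgraded.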
(14:56:24) laktop: the more i think about it the more i like the text portions of ldap as a solution
(14:56:33) laktop: their querying isn’t a bad way to do it
(14:56:36) icblenke: because my agent would be coded to understand version 1 syslog events. when we come out with version 2 events that add something that breaks version 1 event processing, we would need to upgrade that agent.
(14:56:47) laktop: obviously don’t use their transport or protocol or hierarchies or names
(14:56:56) icblenke: obviously. ;)
(14:57:08) laktop: but the inheritance and querying expressions aren’t bad, and i definitely like having someone to copy
(14:57:15) icblenke: OIDs and ASN.. ick.
(14:57:28) laktop: icblenke: no, we skip all the x.500 crap
(14:57:32) laktop: shudder
(14:57:39) f3ew: think Postfix config
(14:57:45) f3ew: or Exim
(14:57:57) laktop: to the human, it would look a lot like ldap, but to the computer it would look like http or smtp or something
(14:58:13) f3ew: Actually, it would look like key = value to both
(14:58:19) f3ew: :P
(14:58:23) laktop: you know what i mean, tho
(14:58:30) f3ew: yes
(14:58:54) f3ew: Ok, back to original thread
(14:58:59) icblenke: never really mastered ldap expressions myself.
(14:59:36) laktop: they’re stupid easy, esp. for default cases == "field=value"
(14:59:42) laktop: ok
(15:00:14) laktop: so, we have our messages, we have producers and consumers, and we have converters that accept non-runnel messages and convert them to runnel messages and add them to the runnel stream
(15:00:21) icblenke: if regex is supported, I’d be even happier.
(15:00:26) laktop: we have runnel routers, and we can subscribe to those runnel routers
(15:00:44) laktop: icblenke: that’s a good point, in terms of not copying ldap, although whose regexes? :)
(15:00:59) icblenke: true ;)
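One way the router and subscription idea could look, as a rough sketch rather than a design anyone in the log agreed on. The `Router` class, its `subscribe` and `publish` methods, and the choice of Python's `re` syntax for the regexes are all assumptions; the question of whose regex dialect to use is left open in the discussion.

```python
import re

class Router:
    """Hypothetical runnel router: delivers each message to matching subscribers."""

    def __init__(self):
        self.subscriptions = []  # list of (compiled patterns, callback) pairs

    def subscribe(self, patterns, callback):
        # patterns maps a field name to a regular expression that the
        # field's value must fully match (Python regex syntax, by assumption).
        compiled = {field: re.compile(rx) for field, rx in patterns.items()}
        self.subscriptions.append((compiled, callback))

    def publish(self, message):
        # Producers hand every message to the router and name no destination.
        delivered = 0
        for compiled, callback in self.subscriptions:
            if all(field in message and rx.fullmatch(message[field])
                   for field, rx in compiled.items()):
                callback(message)
                delivered += 1
        return delivered

received = []
router = Router()
router.subscribe({"type": r"syslog|auth", "host": r"mad.*"}, received.append)

router.publish({"type": "syslog", "host": "madhatter", "text": "disk 2% full"})
router.publish({"type": "snmp", "host": "madhatter", "text": "trap"})
print(len(received))  # 1
```

Only the first message reaches the subscriber, because the second one has a type the subscription does not match.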
(15:01:16) laktop: is there any other part of the system we’ve discussed?
(15:02:18) laktop: is there a special term we could apply to the messages? i keep thinking of little boats floating on the stream
(15:02:29) laktop: "microbarge" :)
(15:02:45) icblenke: I called my producers Feed, and my consumers Eat, and the messages Food.
(15:03:16) laktop: you, my friend, need to learn the difference between verbs and nouns :)
(15:03:28) icblenke: heh.
(15:04:27) laktop: i think generic terms suffice for nearly everything except the overall system and the format types
(15:04:53) laktop: in other words, we can’t just say, "convert your event to our message type", we need to be able to say "convert it to a runnel" or whatever
(15:05:06) laktop: i guess we could just call them runnel messages for now
(15:05:15) icblenke: "run runnel".. very logan’s run.
(15:05:19) laktop: he
(15:05:22) laktop: uh, heh
(15:05:42) f3ew: Ok: event
(15:05:54) f3ew: router --> stuff that actually sends "packets" elsewhere
(15:06:21) laktop: f3ew: that’s perfect
(15:07:34) f3ew: consumer --> stuff that accepts messages from a router as a final destination and processes them
(15:07:47) icblenke: indie baby?
(15:07:50) laktop: s/and processes them//
(15:07:58) laktop: it can just throw them away, we don’t care
(15:07:59) f3ew: yeah
(15:08:16) f3ew: that is still processing, but let’s call that a processor
(15:08:27) laktop: and one of the things a consumer can do is convert them to an outgoing message type (e.g., snmp) and pass it along in that way
(15:08:53) laktop: and i know at least some people will want that
(15:08:55) f3ew: laktop yes
(15:09:02) f3ew: but that isn’t really relevant
(15:09:09) f3ew: the arrival of the message is an event
(15:09:18) f3ew: and that triggers a new chain
(15:09:23) laktop: well, your definition of "converter" implies they’re only incoming, but there will also be outgoing converters, too
(15:10:29) f3ew: Ok, the terminology is defined
(15:10:40) laktop: great
(15:10:52) f3ew: event -> acceptor -> convertor -> router -> consumer -> processor
(15:11:05) laktop: ok
(15:11:10) f3ew: a processor’s output is an action
(15:11:21) laktop: i wouldn’t go past the processor
(15:11:21) f3ew: event -> acceptor -> convertor -> router -> consumer -> processor -> action
(15:11:31) f3ew: Events result in actions
(15:11:42) f3ew: which might be null
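The chain of terms above can be sketched as plain functions, one per stage. This is only an illustration under stated assumptions: the `percent_full=2` event format, the 90 percent threshold, and the "page the admin" action are invented for the example, and `None` stands in for the null action.

```python
def acceptor(raw_event):
    # Accepts an event, i.e. a message from an external service, unchanged.
    return raw_event

def convertor(raw_event):
    # Converts a non-runnel event into a runnel message (key/value pairs).
    # The "percent_full=2" input format is an assumption for this sketch.
    key, value = raw_event.split("=", 1)
    return {"type": "disk", key: value}

def router(message, consumers):
    # Sends the message on to every consumer and collects the results.
    return [consumer(message) for consumer in consumers]

def processor(message):
    # Every message yields an action; None means "take no further action".
    # The threshold and the action text are hypothetical.
    if int(message["percent_full"]) >= 90:
        return "page the admin"
    return None

def consumer(message):
    # Final destination for the message; here it hands off to a processor.
    return processor(message)

def run(raw_event):
    # event -> acceptor -> convertor -> router -> consumer -> processor -> action
    message = convertor(acceptor(raw_event))
    return router(message, [consumer])

print(run("percent_full=2"))   # [None]
print(run("percent_full=95"))  # ['page the admin']
```

Treating "no action" as an explicit null result keeps every event on the same path, which is the consistency argument made in the lines that follow.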
(15:11:45) laktop: i’m uncomfortable with that claim
(15:11:51) f3ew: why?
(15:12:13) laktop: i just don’t know that it’s necessary for this discussion
(15:12:25) f3ew: we processed a message that a disk was 2% full, and acted to take no further action :)
(15:12:35) laktop: yeah, but i think you’ll have a lot of those
(15:12:42) f3ew: It isn’t. It just completes the whole message bus thing
(15:12:48) f3ew: who cares?
(15:13:05) kevc: obviously the solution is just to put the messages over irc anyway
(15:13:06) laktop: i’ll trust you on this one, because i’m ignorant here and you sure sound like you know what you’re talking about, but i’m still hesitant
(15:13:14) f3ew: rather than having some events having actions, they all have actions
(15:13:14) laktop: kevc: you’re fired
(15:13:18) kevc: :)
(15:13:39) laktop: kevc: although i can guarantee that i’ll have an irc consumer early on
(15:13:48) laktop: irc is a great notification system
(15:14:01) kevc: it is useful, yes
(15:14:18) f3ew: laktop It keeps things consistent
(15:14:27) f3ew: nothing more
(15:14:42) kevc: it’s a horrible protocol, but it does somehow tend to work
(15:14:56) laktop: f3ew: like i said, i’ll follow your lead here
(15:15:17) laktop: f3ew: do you want to send the definitions to the list, or do you want me to?
(15:15:35) laktop: hmm
(15:15:46) laktop: trac has a wiki, and i can get a trac instance up real fast
(15:15:50) laktop: should i do that?
(15:17:23) laktop: crickets
(15:17:32) icblenke: work intervenes.
(15:17:39) laktop: damn work!
(15:17:48) laktop: fortunately, my boss says this is my job right now
(15:17:57) icblenke: yeah. damn your boss.
(15:18:27) icblenke: working for yourself is hard, your boss is a tyrant.
(15:18:54) laktop: yep
(15:19:27) laktop: i sure hope this client in pdx turns out
(15:19:37) laktop: i’m thinking i should take down that post about bladelogic just because of that
(15:20:01) laktop: f3ew: i dunno if you saw this, but check http://madstop.com
(15:22:02) icblenke: while it’s ok to vent, it rarely helps to do it in a public place.
(15:22:24) laktop: i don’t think of the post as venting
(15:22:35) laktop: i don’t know exactly what i do think of it as
(15:23:43) laktop: i think one of the reasons companies tend to bully is because it’s so easy to get away with it, because people get afraid
(15:24:03) laktop: so the post is a way of letting others know what’s going on and that i’m particularly interested in being bullied
(15:24:26) laktop: let’s just say it’s not the first time i’ve chosen the louder of two options
(15:24:31) icblenke: and most of your audience will understand and respect that.
(15:24:53) laktop: rather, i’m not particularly interested in being bullied
(15:25:08) icblenke: got that ;)
(15:25:19) laktop: yeah, all 3 people who read madstop.com :)
(15:26:55) f3ew: back
(15:27:12) f3ew: laktop send them to the list, and a link to the log
(15:28:01) laktop: ok
(15:28:16) laktop: i’ll put the log up when i get back home, since my gaim instance at home will have the whole log
(15:28:31) laktop: do we want to continue the discussion now, or wait a bit?
(15:28:34) f3ew: cool
(15:28:59) laktop: i’ll be around more today and all day tomorrow, but i leave saturday and will be at LISA all next week and traveling and such the week after
(15:29:03) f3ew: Well, I have to be up at 6:30, I have a talk at 10 am and it is now 3 am
(15:29:09) laktop: ouch!
(15:29:16) laktop: might as well just stay up :)
(15:30:45) laktop: ok, i’ll head back home, post the log, and then email the definitions to the list
(15:30:49) laktop left the room (quit: "Leaving").
(15:30:52) f3ew: right, thanks
$Id: characterization.page 6 2006-07-02 20:05:14Z luke $