The Formal Connector is a protocol-aware reverse proxy that understands wire protocols including HTTP, Postgres, MySQL, and Kubernetes, giving security teams visibility into and control over their data flows.
Customers typically start by deploying Formal for one kind of data store, then extend coverage to other data stores in their infrastructure. To keep deployments simple, the engineering team made it possible for the Connector to listen on a single port for multiple technologies. It does this by detecting the incoming wire protocol and then loading the matching wire-protocol interpreter.
Initial Approach
The initial strategy was to read the first bytes of each TCP connection, identify the protocol from them, and then route the connection to the matching interpreter.
The bytes read from the connection during classification were then written explicitly before io.Copy took over relaying the rest of the traffic. This worked for most protocols, but MongoDB ran into trouble because of its protocol framing.
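To make the idea of classifying from the first bytes concrete, here is a minimal sketch. It is not the Connector's actual classifier: the function name classifyFirstBytes and the heuristics are assumptions for illustration. It relies only on publicly documented framing: HTTP/1.x requests begin with a method token, a PostgreSQL startup packet carries a big-endian protocol version (196608) or SSLRequest code (80877103) at offset 4, and a MongoDB message header carries a little-endian opCode at offset 12. Protocols where the server sends the first packet, such as MySQL, need different handling and are left out.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// classifyFirstBytes is a hypothetical, heuristic classifier used only for
// illustration. It guesses the protocol from the first bytes a client sends.
func classifyFirstBytes(b []byte) string {
	// HTTP/1.x requests start with a method token followed by a space.
	for _, m := range []string{"GET ", "POST ", "PUT ", "HEAD ", "DELETE ", "PATCH ", "OPTIONS "} {
		if bytes.HasPrefix(b, []byte(m)) {
			return "http"
		}
	}
	// PostgreSQL: Int32 length, then Int32 code, both big-endian.
	// 196608 is protocol version 3.0; 80877103 is the SSLRequest code.
	if len(b) >= 8 {
		code := binary.BigEndian.Uint32(b[4:8])
		if code == 196608 || code == 80877103 {
			return "postgres"
		}
	}
	// MongoDB: 16-byte little-endian header; opCode sits at offset 12.
	// 2013 is OP_MSG and 2004 is the legacy OP_QUERY.
	if len(b) >= 16 {
		length := binary.LittleEndian.Uint32(b[0:4])
		opCode := binary.LittleEndian.Uint32(b[12:16])
		if length >= 16 && (opCode == 2013 || opCode == 2004) {
			return "mongodb"
		}
	}
	return "unknown"
}

func main() {
	pg := make([]byte, 8)
	binary.BigEndian.PutUint32(pg[0:4], 8)
	binary.BigEndian.PutUint32(pg[4:8], 196608)

	mongo := make([]byte, 16)
	binary.LittleEndian.PutUint32(mongo[0:4], 16)
	binary.LittleEndian.PutUint32(mongo[4:8], 1)
	binary.LittleEndian.PutUint32(mongo[12:16], 2013)

	fmt.Println(classifyFirstBytes([]byte("GET / HTTP/1.1\r\n")))
	fmt.Println(classifyFirstBytes(pg))
	fmt.Println(classifyFirstBytes(mongo))
}
```

Heuristics like these can collide on unusual inputs, so a production classifier would likely need stricter checks than this sketch performs.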
// classify reads the first bytes of the connection, identifies the protocol,
// and returns the matching technology along with the bytes it consumed.
func (h *connectionHandler) classify(conn net.Conn) (string, []byte, error) {
	buffer := make([]byte, 1024)
	n, err := conn.Read(buffer)
	if err != nil {
		return "", nil, fmt.Errorf("error reading from connection: %w", err)
	}
	protocol := h.classifiers.ClassifyFlow(buffer[:n])
	technology, ok := h.protocols[protocol]
	if !ok {
		return "", nil, fmt.Errorf("classified protocol not handled: %s", protocol)
	}
	return technology, buffer[:n], nil
}

// handleMultipleServers classifies the connection and hands it to the server
// registered for the detected technology.
func (h *connectionHandler) handleMultipleServers(ctx context.Context, conn net.Conn) error {
	technology, initialBytes, err := h.classify(conn)
	if err != nil {
		return fmt.Errorf("error while classifying connection: %w", err)
	}
	server, ok := h.servers[technology]
	if !ok {
		return fmt.Errorf("server not found for technology: %s", technology)
	}
	// Explicitly write the initial bytes before using io.Copy
	_, err = conn.Write(initialBytes)
	if err != nil {
		return fmt.Errorf("error writing initial bytes: %w", err)
	}
	return server.HandleConnection(ctx, conn)
}
MongoDB requires specific framing for its messages. Sending the initial bytes separately from the rest of the stream broke that framing: the initial bytes ended up out of place, which caused failed connections and misinterpreted protocol messages.
The Solution
The problem was solved by wrapping the initial bytes and the connection together in an io.MultiReader. The proxy then forwards the initial bytes and the rest of the connection as one continuous stream, which preserves MongoDB's message structure.
The revised implementation wraps the connection in a peekConn type, so io.Copy can be used directly with no explicit Write. This keeps MongoDB's framing intact and remains compatible with the other protocols.
// peekConn is a net.Conn whose reads are served from reader, so bytes that
// were already consumed during classification can be replayed first.
type peekConn struct {
	net.Conn
	reader io.Reader
}

// Read overrides the embedded Conn's Read; all other methods pass through.
func (pc *peekConn) Read(b []byte) (int, error) {
	return pc.reader.Read(b)
}

// classify identifies the protocol and returns a connection that replays
// the bytes it consumed before continuing with the live stream.
func (h *connectionHandler) classify(conn net.Conn) (string, net.Conn, error) {
	buffer := make([]byte, 1024)
	n, err := conn.Read(buffer)
	if err != nil {
		return "", nil, fmt.Errorf("error reading from connection: %w", err)
	}
	protocol := h.classifiers.ClassifyFlow(buffer[:n])
	technology, ok := h.protocols[protocol]
	if !ok {
		return "", nil, fmt.Errorf("classified protocol not handled: %s", protocol)
	}
	return technology, &peekConn{
		Conn:   conn,
		reader: io.MultiReader(bytes.NewReader(buffer[:n]), conn),
	}, nil
}

func (h *connectionHandler) handleMultipleServers(ctx context.Context, conn net.Conn) error {
	// Named wrapped rather than peekConn to avoid shadowing the type.
	technology, wrapped, err := h.classify(conn)
	if err != nil {
		return fmt.Errorf("error while classifying connection: %w", err)
	}
	server, ok := h.servers[technology]
	if !ok {
		return fmt.Errorf("server not found for technology: %s", technology)
	}
	// No explicit Write needed - peekConn seamlessly replays initial bytes
	return server.HandleConnection(ctx, wrapped)
}
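The replay behaviour can be checked in isolation with a small, self-contained sketch. It uses net.Pipe as a stand-in for a real client connection; the names replayConn, peekAndWrap, and roundTrip are hypothetical and exist only for this example. The point is that a consumer reading from the wrapped connection sees the peeked bytes followed by the remainder, exactly as the client sent them.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net"
)

// replayConn mirrors the peekConn idea: reads come from reader, everything
// else is delegated to the embedded connection.
type replayConn struct {
	net.Conn
	reader io.Reader
}

func (c *replayConn) Read(b []byte) (int, error) { return c.reader.Read(b) }

// peekAndWrap consumes up to n bytes from conn and returns them together
// with a connection that replays those bytes before the live stream.
func peekAndWrap(conn net.Conn, n int) ([]byte, net.Conn, error) {
	buf := make([]byte, n)
	read, err := conn.Read(buf)
	if err != nil {
		return nil, nil, err
	}
	peeked := buf[:read]
	wrapped := &replayConn{
		Conn:   conn,
		reader: io.MultiReader(bytes.NewReader(peeked), conn),
	}
	return peeked, wrapped, nil
}

// roundTrip sends payload through an in-memory pipe, peeks at the first
// bytes on the receiving side, then reads the whole stream via the wrapper.
func roundTrip(payload []byte, peek int) ([]byte, []byte, error) {
	client, server := net.Pipe()
	go func() {
		client.Write(payload)
		client.Close()
	}()
	peeked, wrapped, err := peekAndWrap(server, peek)
	if err != nil {
		return nil, nil, err
	}
	defer wrapped.Close()
	all, err := io.ReadAll(wrapped)
	return peeked, all, err
}

func main() {
	peeked, all, err := roundTrip([]byte("hello world"), 5)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("peeked=%q full=%q\n", peeked, all)
}
```

A bufio.Reader with Peek is another common way to inspect leading bytes without consuming them; the MultiReader approach shown here matches the one described in the text.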
The fix was a reminder that deep packet inspection and protocol identification depend on understanding each protocol's specifics, and that a proxy handling several protocols has to stay flexible enough to accommodate them.