This section documents internal structs. Since they are truly internal, we can and will change them occasionally which might make this section slightly out of date at times.
The `Curl_easy` struct is the one returned to the outside in the external API as an opaque `CURL *`. This pointer is usually known as an easy handle in API documentation and examples.
Information and state related to the actual connection is kept in the `connectdata` struct. When a transfer is about to be made, libcurl will either create a new connection or re-use an existing one. The current connectdata that is used by this handle is pointed out by
Data and information that regard this particular single transfer is put in the
When a `Curl_easy` struct is added to a multi handle, as it must be in order to do any transfer, the `->multi` member will point to the `Curl_multi` struct it belongs to. The `->next` members will then be used by the multi code to keep a linked list of `Curl_easy` structs that are added to that same multi handle. libcurl always uses multi so `->multi` will point to a `Curl_multi` when a transfer is in progress.
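The relationship between the two structs can be pictured with a minimal sketch. The struct and function names below (`sk_easy`, `sk_multi`, `sk_multi_add`, `first`, `last`) are hypothetical stand-ins invented for illustration, not the real definitions, which carry many more members.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the real structs. */
struct sk_multi;

struct sk_easy {
  struct sk_multi *multi; /* owning multi handle, NULL when not added */
  struct sk_easy *next;   /* next easy handle in the same multi */
};

struct sk_multi {
  struct sk_easy *first;  /* start of the list of added handles */
  struct sk_easy *last;   /* end of the list, for cheap appends */
  int num_easy;           /* counter of added handles */
};

/* Append 'data' to the list kept by 'multi'. Returns 0 on success and
   -1 if the handle already belongs to a multi handle. */
static int sk_multi_add(struct sk_multi *multi, struct sk_easy *data)
{
  assert(multi && data);
  if(data->multi)
    return -1;
  data->multi = multi;
  data->next = NULL;
  if(multi->last)
    multi->last->next = data;
  else
    multi->first = data;
  multi->last = data;
  multi->num_easy++;
  return 0;
}
```

In this sketch a handle can only be a member of one multi handle at a time, which is why the back pointer doubles as an "already added" check.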
`->mstate` is the multi state of this particular handle. When `multi_runsingle()` is called, it will act on this handle according to which state it is in. The mstate is also what tells which sockets to return for a specific handle when `curl_multi_fdset()` is called etc.
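As a rough illustration of a per-handle state driving the work, here is a small state machine sketch. The state names, the `sk_transfer` struct and both functions are assumptions made up for this example; the real state list is longer and each state does actual work before moving on.

```c
#include <assert.h>

/* Hypothetical, shortened list of states. */
enum sk_mstate {
  SK_INIT,
  SK_CONNECTING,
  SK_DO,
  SK_PERFORMING,
  SK_DONE
};

struct sk_transfer {
  enum sk_mstate mstate; /* current state of this transfer */
};

/* One call advances the transfer by at most one step, based only on the
   state it is currently in. Returns 0 once the transfer is already in
   its final state, 1 otherwise. */
static int sk_runsingle(struct sk_transfer *data)
{
  assert(data);
  switch(data->mstate) {
  case SK_INIT:
    data->mstate = SK_CONNECTING;
    break;
  case SK_CONNECTING:
    data->mstate = SK_DO;
    break;
  case SK_DO:
    data->mstate = SK_PERFORMING;
    break;
  case SK_PERFORMING:
    data->mstate = SK_DONE;
    break;
  case SK_DONE:
    return 0;
  }
  return 1;
}

/* The state also decides whether any socket is of interest: in this
   sketch, nothing is waited for before connecting or after completion. */
static int sk_wants_socket(const struct sk_transfer *data)
{
  assert(data);
  return data->mstate != SK_INIT && data->mstate != SK_DONE;
}
```

Keeping the state in the handle itself lets the caller drive many transfers by simply calling the same function on each of them in turn.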
The libcurl source code generally uses the name `data` everywhere for the local variable that points to the `Curl_easy`.
When doing multiplexed HTTP/2 transfers, each `Curl_easy` is associated with an individual stream, sharing the same connectdata struct. Multiplexing makes it even more important to keep things associated with the right thing!
A general idea in libcurl is to keep connections around in a connection "cache" after they have been used, in case they will be used again, and then re-use an existing one instead of creating a new one. Re-use gives a significant performance boost.
Each `connectdata` struct identifies a single physical connection to a server. If the connection cannot be kept alive, the connection will be closed after use and then this struct can be removed from the cache and freed.
Thus, the same `Curl_easy` can be used multiple times and each time select another `connectdata` struct to use for the connection. Keep this in mind, as it is then important to consider if options or choices are based on the connection or the `Curl_easy`.
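The "re-use if possible, otherwise create" decision can be sketched as follows. Everything here (`sk_conn`, `sk_conncache`, `sk_get_conn`, the fixed-size array and matching on host and port only) is a simplifying assumption for illustration; the real matching logic considers far more than host and port.

```c
#include <assert.h>
#include <string.h>

#define SK_CACHE_SIZE 4

struct sk_conn {
  char host[64];
  int port;
  int in_use; /* a transfer is currently using this connection */
  int alive;  /* this slot holds an open connection */
};

struct sk_conncache {
  struct sk_conn slots[SK_CACHE_SIZE];
};

/* Return an idle cached connection to host:port if there is one,
   otherwise set up a new one in a free slot. '*reused' tells which
   happened. Returns NULL when no slot is available. */
static struct sk_conn *sk_get_conn(struct sk_conncache *cache,
                                   const char *host, int port, int *reused)
{
  struct sk_conn *free_slot = NULL;
  int i;
  assert(cache && host && reused);
  for(i = 0; i < SK_CACHE_SIZE; i++) {
    struct sk_conn *c = &cache->slots[i];
    if(!c->alive) {
      if(!free_slot)
        free_slot = c;
      continue;
    }
    if(!c->in_use && c->port == port && !strcmp(c->host, host)) {
      c->in_use = 1;
      *reused = 1;
      return c;
    }
  }
  *reused = 0;
  if(!free_slot || strlen(host) >= sizeof(free_slot->host))
    return NULL;
  strcpy(free_slot->host, host);
  free_slot->port = port;
  free_slot->alive = 1;
  free_slot->in_use = 1;
  return free_slot;
}

/* Hand the connection back after a transfer; it stays cached. */
static void sk_release_conn(struct sk_conn *c)
{
  assert(c);
  c->in_use = 0;
}
```

Note how nothing in `sk_conn` refers back to a particular transfer: the same connection may serve different transfers over time, which is the reason to think about whether a setting belongs to the connection or to the transfer.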
As a special complexity, some protocols supported by libcurl require a special disconnect procedure that is more than just shutting down the socket. It can involve sending one or more commands to the server before doing so. Since connections are kept in the connection cache after use, the original `Curl_easy` may no longer be around when the time comes to shut down a particular connection. For this purpose, libcurl holds a special dummy `Curl_easy`, the `->closure_handle`, in the `Curl_multi` struct to use when needed.
FTP uses two TCP connections for a typical transfer but it keeps both in this single struct and thus can be considered a single connection for most internal concerns.
The libcurl source code generally uses the name `conn` for the local variable that points to the connectdata.
Internally, the easy interface is implemented as a wrapper around multi interface functions. This makes everything multi interface.
`Curl_multi` is the multi handle struct exposed as the opaque `CURLM *` in external APIs.
This struct holds a list of `Curl_easy` structs that have been added to this handle with `curl_multi_add_handle()`. The start of the list is
`->num_easy` is a counter of added `Curl_easy` structs.
`->msglist` is a linked list of messages to send back when `curl_multi_info_read()` is called. Basically a node is added to that list when an individual `Curl_easy`'s transfer has completed.
`->hostcache` points to the name cache. It is a hash table for looking up the IP address for a name. The nodes have a limited lifetime in there, and the cache is meant to save time when the same name is wanted again within a short period.
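The idea of cache entries with a limited lifetime can be sketched like this. The names (`sk_hostcache`, `sk_dns_store`, `sk_dns_lookup`), the 60 second lifetime and the plain array used instead of a hash table are all assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <string.h>

#define SK_DNS_ENTRIES 8
#define SK_DNS_LIFETIME 60 /* seconds an entry stays valid (assumed) */

struct sk_dns_entry {
  char name[64];
  char addr[46]; /* textual IPv4 or IPv6 address */
  long created;  /* time in seconds when the entry was stored */
  int used;
};

struct sk_hostcache {
  struct sk_dns_entry entries[SK_DNS_ENTRIES];
};

/* Store a name to address mapping. Returns 0 on success, -1 when the
   input is too long or no slot is free or expired. */
static int sk_dns_store(struct sk_hostcache *hc, const char *name,
                        const char *addr, long now)
{
  int i;
  assert(hc && name && addr);
  if(strlen(name) >= sizeof(hc->entries[0].name) ||
     strlen(addr) >= sizeof(hc->entries[0].addr))
    return -1;
  for(i = 0; i < SK_DNS_ENTRIES; i++) {
    struct sk_dns_entry *e = &hc->entries[i];
    /* take a free slot or one whose entry has expired */
    if(!e->used || now - e->created >= SK_DNS_LIFETIME) {
      strcpy(e->name, name);
      strcpy(e->addr, addr);
      e->created = now;
      e->used = 1;
      return 0;
    }
  }
  return -1;
}

/* Return the cached address for 'name', or NULL if it is unknown or
   the entry has become too old. */
static const char *sk_dns_lookup(struct sk_hostcache *hc, const char *name,
                                 long now)
{
  int i;
  assert(hc && name);
  for(i = 0; i < SK_DNS_ENTRIES; i++) {
    struct sk_dns_entry *e = &hc->entries[i];
    if(e->used && !strcmp(e->name, name)) {
      if(now - e->created < SK_DNS_LIFETIME)
        return e->addr;
      e->used = 0; /* too old: drop it */
    }
  }
  return NULL;
}
```

Passing the current time in as a parameter keeps the sketch deterministic; a caller would typically supply a monotonic clock value.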
`->timetree` points to a tree of `Curl_easy` structs, sorted by the remaining time until each should be checked, normally some sort of timeout. Each `Curl_easy` has one node in the tree.
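What such an ordering is used for can be shown with a small sketch that finds the handle that needs attention first. It deliberately uses a linear scan over an array instead of a tree, and the names (`sk_timer_entry`, `sk_next_expiry`, `sk_timeout_ms`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct sk_timer_entry {
  long expire_ms; /* absolute time to check this handle, 0 = no timeout */
};

/* Return the entry that expires first, or NULL if none has a timeout. */
static struct sk_timer_entry *sk_next_expiry(struct sk_timer_entry *handles,
                                             size_t count)
{
  struct sk_timer_entry *best = NULL;
  size_t i;
  assert(handles || !count);
  for(i = 0; i < count; i++) {
    if(handles[i].expire_ms &&
       (!best || handles[i].expire_ms < best->expire_ms))
      best = &handles[i];
  }
  return best;
}

/* Milliseconds to wait until something needs checking: 0 if a timeout
   is already due, -1 if no handle has a timeout set. */
static long sk_timeout_ms(struct sk_timer_entry *handles, size_t count,
                          long now_ms)
{
  struct sk_timer_entry *next = sk_next_expiry(handles, count);
  if(!next)
    return -1;
  return next->expire_ms > now_ms ? next->expire_ms - now_ms : 0;
}
```

A sorted tree gives the same answer without visiting every handle, which matters when many transfers are added to the same multi handle.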
`->sockhash` is a hash table that allows fast lookups from a socket descriptor to the `Curl_easy` that uses that descriptor. This is necessary for the
`->conn_cache` points to the connection cache. It keeps track of all connections that are kept after use. The cache has a maximum size.
`->closure_handle` is described in the `connectdata` description above.
The libcurl source code generally uses the name `multi` for the variable that points to the `Curl_multi`.
Each unique protocol that is supported by libcurl needs to provide at least one `Curl_handler` struct. It defines what the protocol is called and what functions the main code should call to deal with protocol specific issues. In general, there is a source file named `[protocol].c` in which a `struct Curl_handler Curl_handler_[protocol]` is declared. In `url.c` there is then a single array pointing to all the individual `Curl_handler` structs, which is scanned through when a URL is given to libcurl to work with.
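Scanning an array of handlers for a URL scheme can be sketched as below. The `sk_handler` struct is a cut-down, hypothetical version with a single callback, and `sk_find_handler` plus the two dummy `do_it` functions are invented for the example; only the member names `scheme`, `do_it` and `defport` are taken from the text.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cut-down handler with a single callback. */
struct sk_handler {
  const char *scheme;            /* URL scheme name */
  int (*do_it)(const char *url); /* issues the transfer request */
  int defport;                   /* default port */
};

static int sk_http_do(const char *url) { (void)url; return 1; }
static int sk_ftp_do(const char *url) { (void)url; return 2; }

static const struct sk_handler sk_handler_http = { "HTTP", sk_http_do, 80 };
static const struct sk_handler sk_handler_ftp = { "FTP", sk_ftp_do, 21 };

/* The single array that gets scanned; NULL terminates it. */
static const struct sk_handler * const sk_protocols[] = {
  &sk_handler_http,
  &sk_handler_ftp,
  NULL
};

/* Case-insensitive compare of the first 'len' bytes of 'a' with all of
   the string 'b'. */
static int sk_scheme_equal(const char *a, size_t len, const char *b)
{
  size_t i;
  for(i = 0; i < len; i++) {
    if(!b[i] ||
       tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
      return 0;
  }
  return b[len] == '\0';
}

/* Find the handler for the scheme in front of "://", or NULL. */
static const struct sk_handler *sk_find_handler(const char *url)
{
  const char *sep;
  size_t len;
  size_t i;
  assert(url);
  sep = strstr(url, "://");
  if(!sep)
    return NULL;
  len = (size_t)(sep - url);
  for(i = 0; sk_protocols[i]; i++) {
    if(sk_scheme_equal(url, len, sk_protocols[i]->scheme))
      return sk_protocols[i];
  }
  return NULL;
}
```

Once the handler is found, the generic code only needs the function pointers in it, so adding a protocol in this model means adding one struct and one array entry.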
The concrete function pointer prototypes can be found in
`->scheme` is the URL scheme name, usually spelled out in uppercase, such as "HTTP" or "FTP". SSL versions of a protocol need their own `Curl_handler` setup, so HTTPS is separate from HTTP.
`->setup_connection` is called to allow the protocol code to allocate protocol specific data that then gets associated with that `Curl_easy` for the rest of this transfer. It gets freed again at the end of the transfer. It will be called before the `connectdata` for the transfer has been selected/created. Most protocols will allocate their private `struct [PROTOCOL]` here and assign
`->connect_it` allows a protocol to do some specific actions after the TCP connect is done, that can still be considered part of the connection phase. Some protocols will alter the `connectdata->send` function pointers in this function.
`->connecting` is similarly a function that keeps getting called as long as the protocol considers itself still in the connecting phase.
`->do_it` is the function called to issue the transfer request, which we call the DO action internally. If the DO is not enough and things need to be kept getting done for the entire DO sequence to complete, `->doing` is then usually also provided. Each protocol that needs to do multiple commands or similar for do/doing needs to implement its own state machine (see SCP, SFTP, FTP). Some protocols (only FTP and only due to historical reasons) have a separate piece of the DO state called `DO_MORE`.
`->doing` keeps getting called while issuing the transfer request command(s).
`->done` gets called when the transfer is complete and DONE. That is after the main data has been transferred.
`->do_more` gets called during the `DO_MORE` state. The FTP protocol uses this state when setting up the second connection.
`->perform_getsock` is one of the functions that return socket information: which socket(s) to wait for which I/O action(s) during the particular multi state.
`->disconnect` is called immediately before the TCP connection is shut down.
`->readwrite` gets called during transfer to allow the protocol to do extra reads/writes.
`->attach` attaches a transfer to the connection.
`->defport` is the default TCP or UDP port this protocol uses.
`->protocol` is one or more bits in the `CURLPROTO_*` set. The SSL versions have their "base" protocol set and then the SSL variation, like "HTTP|HTTPS".
`->flags` is a bitmask with additional information about the protocol that will make it get treated differently by the generic engine:
`PROTOPT_SSL` - will make it connect and negotiate SSL
`PROTOPT_DUAL` - this protocol uses two connections
`PROTOPT_CLOSEACTION` - this protocol has actions to do before closing the connection. This flag is no longer used by code, yet still set for a bunch of protocol handlers.
`PROTOPT_DIRLOCK` - "direction lock". The SSH protocols set this bit to limit which "direction" of socket actions the main engine will concern itself with.
`PROTOPT_NONETWORK` - a protocol that does not use the network (read
`PROTOPT_NEEDSPWD` - this protocol needs a password and will use a default one unless one is provided
`PROTOPT_NOURLQUERY` - this protocol cannot handle a query part on the URL (?foo=bar)
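How generic code can branch on such a bitmask is sketched below. The `SK_`-prefixed bit values are made up for this example (only the flag names mirror the list above), and the `sk_proto` struct and the two helper functions are hypothetical.

```c
#include <assert.h>

/* Bit values are invented for this sketch; only the names mirror the
   flags described in the text. */
#define SK_PROTOPT_SSL        (1u << 0)
#define SK_PROTOPT_DUAL       (1u << 1)
#define SK_PROTOPT_NONETWORK  (1u << 2)
#define SK_PROTOPT_NEEDSPWD   (1u << 3)
#define SK_PROTOPT_NOURLQUERY (1u << 4)

struct sk_proto {
  const char *scheme;
  unsigned int flags; /* bitwise OR of SK_PROTOPT_* values */
};

/* Non-zero if the generic engine should set up a network connection. */
static int sk_uses_network(const struct sk_proto *p)
{
  assert(p);
  return !(p->flags & SK_PROTOPT_NONETWORK);
}

/* Non-zero if the connection must negotiate SSL after connecting. */
static int sk_needs_ssl(const struct sk_proto *p)
{
  assert(p);
  return (p->flags & SK_PROTOPT_SSL) != 0;
}
```

Because each property is an independent bit, a handler can combine several of them with a bitwise OR, and the engine never needs to compare scheme names to decide how to treat a protocol.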
The connection cache is a hash table with connections for later re-use. Each `Curl_easy` has a pointer to its connection cache. Each multi handle sets up a connection cache that all added `Curl_easy` structs share by default.
The libcurl share API allocates a `Curl_share` struct, exposed to the external API as
The idea is that the struct can have a set of its own versions of caches and pools. By providing this struct in the `CURLOPT_SHARE` option, those specific `Curl_easy` structs will use the caches/pools that this share handle holds.
Individual `Curl_easy` structs can then be made to share specific things that they otherwise would not, such as cookies.
The `Curl_share` struct can currently hold cookies, the DNS cache and the SSL session cache.
`CookieInfo` is the main cookie struct. It holds all known cookies and related information. Each `Curl_easy` has its own private `CookieInfo`, even when they are added to a multi handle. They can be made to share cookies by using the share API.
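The choice between a private cookie store and a shared one can be modelled with a short sketch. All names here (`sk_cookies`, `sk_share`, `sk_cookie_user`, `sk_cookie_store`) are hypothetical, and a plain counter stands in for the actual cookie contents.

```c
#include <assert.h>
#include <stddef.h>

struct sk_cookies {
  int count; /* number of cookies held; stands in for the real contents */
};

struct sk_share {
  struct sk_cookies cookies; /* cookie store owned by the share object */
  int share_cookies;         /* non-zero when cookie sharing is enabled */
};

struct sk_cookie_user {
  struct sk_cookies own;     /* private cookie store */
  struct sk_share *share;    /* set when a share object is attached */
};

/* Pick the cookie store this handle should use: the shared one when a
   share object is attached and shares cookies, else the private one. */
static struct sk_cookies *sk_cookie_store(struct sk_cookie_user *data)
{
  assert(data);
  if(data->share && data->share->share_cookies)
    return &data->share->cookies;
  return &data->own;
}
```

With the public API, the equivalent setup is done by creating a share object with `curl_share_init()`, enabling cookie sharing with `curl_share_setopt()` using `CURLSHOPT_SHARE` and `CURL_LOCK_DATA_COOKIE`, and then passing the share object to each easy handle with `CURLOPT_SHARE`.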