Internet-Draft | IMAP UIDBATCHES Extension | December 2024 |
Eggert | Expires 10 June 2025 | [Page] |
The UIDBATCHES extension of the Internet Message Access Protocol (IMAP) allows clients to retrieve UIDs from the server such that these UIDs split the messages of a mailbox into equally sized batches. This lets the client perform operations such as FETCH/SEARCH/STORE on these specific batches. This limits the number of messages that each command operates on, enabling better control over resource usage and response sizes.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 10 June 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
This document defines an extension to the Internet Message Access Protocol [RFC3501] to retrieve UIDs that divide a mailbox's messages into evenly sized batches. This extension is compatible with both IMAP4rev1 [RFC3501] and IMAP4rev2 [RFC9051].¶
The purpose of this extension is to allow clients to (pre-)determine UID ranges that limit the number of messages that each command operates on. This is especially beneficial with [RFC9586] UIDONLY mode, where sequence numbers are unavailable to the client.¶
In protocol examples, this document uses a prefix of "C: " to denote lines sent by the client to the server, and "S: " for lines sent by the server to the client. Lines prefixed with "// " are comments explaining the previous protocol line. These prefixes and comments are not part of the protocol. Lines without any of these prefixes are continuations of the previous line, and no line break is present in the protocol unless specifically mentioned.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
Other capitalised words are IMAP keywords [RFC3501] or keywords from this document.¶
An IMAP server advertises support for the UIDBATCHES extension by including the UIDBATCHES capability in the CAPABILITY response / response code.¶
When the client sends a UIDBATCHES command to the server, the server will return the UID ranges that partition the messages in the currently selected mailbox into equally sized batches.¶
Batches are arranged by descending UID order, with the first batch containing the highest UIDs.¶
For a mailbox with <N> messages, requesting batches of size <M> will return the UID ranges corresponding to the sequence numbers¶
<N>:<N-M+1> <N-M>:<N-2*M+1> <N-2*M>:<N-3*M+1> ...¶
If the currently selected mailbox has 6823 messages and the client sends¶
C: A302 UIDBATCHES 2000¶
the server would return a response similar to¶
S: * UIDBATCHES (TAG "A302") UID ALL 163886:99703,99696:20358,20351:7841,7830:1
S: A302 OK UIDBATCHES Completed¶
The UID range 163886:99703 would span the first 2,000 messages in the mailbox, the UID range 99696:20358 would span the next 2,000 messages, the UID range 20351:7841 would span the next 2,000 messages, and finally the UID range 7830:1 would span the last 823 messages in the mailbox.¶
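The batch arithmetic in the example above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name batch_sequence_ranges is hypothetical, and the sketch works on message sequence numbers, which a real server would then map to UIDs. It also assumes the server returns batches of exactly the requested size.

```python
def batch_sequence_ranges(message_count: int, batch_size: int) -> list[tuple[int, int]]:
    """Return (highest, lowest) sequence-number pairs, newest batch first.

    Hypothetical helper: assumes every batch except the last one holds
    exactly `batch_size` messages.
    """
    if message_count < 0 or batch_size < 1:
        raise ValueError("message_count must be >= 0 and batch_size >= 1")
    ranges = []
    high = message_count
    while high >= 1:
        # Each batch covers batch_size messages, clamped at sequence number 1.
        low = max(high - batch_size + 1, 1)
        ranges.append((high, low))
        high = low - 1
    return ranges


if __name__ == "__main__":
    for high, low in batch_sequence_ranges(6823, 2000):
        print(f"{high}:{low} ({high - low + 1} messages)")
```

For 6823 messages and a batch size of 2,000, this yields three batches of 2,000 messages followed by one batch of 823 messages.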
When the client selects a mailbox, it can use the UIDBATCHES command to find the UIDs that split the mailbox’s messages into batches. For example¶
C: A142 SELECT INBOX
S: * 6823 EXISTS
S: * 1 RECENT
S: * OK [UNSEEN 12] Message 12 is first unseen
S: * OK [UIDVALIDITY 3857529045] UIDs valid
S: * OK [UIDNEXT 215296] Predicted next UID
S: * FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
S: * OK [PERMANENTFLAGS (\Deleted \Seen \*)] Limited
S: A142 OK [READ-WRITE] SELECT completed
C: A143 UIDBATCHES 2000
S: * UIDBATCHES (TAG "A143") UID ALL 215295:99695,99696:20350,20351:7829,7830:1
S: A143 OK UIDBATCHES Completed¶
The client can then use these 4 UID ranges: 215295:99695, 99696:20350, 20351:7829, and 7830:1, where each range has 2,000 messages in it, except for the last range, which only holds the remaining 823 messages.¶
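A client needs to turn the untagged UIDBATCHES response into a list of UID ranges before it can use them. The following Python sketch shows one way to do that. The function name parse_uidbatches and the regular expression are assumptions made for illustration; the input is a single response line without the "S: " prefix, and a response without an ALL part yields an empty list.

```python
import re

# Hypothetical pattern for one untagged UIDBATCHES response line.
_UIDBATCHES_RE = re.compile(
    r'^\* UIDBATCHES \(TAG "(?P<tag>[^"]+)"\) UID(?: ALL (?P<set>[0-9:,]+))?$',
    re.IGNORECASE,
)


def parse_uidbatches(line: str) -> tuple[str, list[tuple[int, int]]]:
    """Return the command tag and a list of (highest, lowest) UID pairs."""
    match = _UIDBATCHES_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a UIDBATCHES response: {line!r}")
    ranges = []
    if match.group("set"):
        for part in match.group("set").split(","):
            first, _, second = part.partition(":")
            a = int(first)
            # A single UID without ":" is treated as a one-message range.
            b = int(second) if second else a
            ranges.append((max(a, b), min(a, b)))
    return match.group("tag"), ranges


if __name__ == "__main__":
    tag, ranges = parse_uidbatches(
        '* UIDBATCHES (TAG "A143") UID ALL 215295:99695,99696:20350,20351:7829,7830:1'
    )
    print(tag, ranges)
```

The client should match the returned tag against the tag of the command it sent before using the ranges.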
Since new messages cannot appear within these UID ranges, the number of messages in each range cannot grow. It may decrease, though, as messages get deleted.¶
The client may choose to keep track of the number of EXPUNGE or VANISHED responses and re-run UIDBATCHES when many messages have been deleted. The client MUST NOT excessively re-run UIDBATCHES and specifically MUST NOT re-run UIDBATCHES unless a minimum of N/2 messages have been deleted from the mailbox, where N is the batch size the client has requested.¶
Similarly, once new messages arrive in the mailbox, the client can start a new message batch 215296:*. Once N or more new messages have arrived, the client can then create a second new batch based on the UID of the Nth message. Alternatively, the client may choose to re-run UIDBATCHES. The client MUST NOT re-run UIDBATCHES if fewer than N/2 new messages have been received.¶
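To clarify, the client MUST NOT re-run UIDBATCHES unless at least one of the conditions described above is met: at least N/2 messages have been deleted from the mailbox, or at least N/2 new messages have been received.¶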
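The re-run rule can be expressed as a small client-side check. The Python sketch below is an illustration only: the function name may_rerun_uidbatches and its counters are hypothetical, and it assumes the client counts deletions and arrivals since its last UIDBATCHES command. It also reads the two thresholds as independent, so that deletions and arrivals are not added together.

```python
def may_rerun_uidbatches(batch_size: int, deleted: int, arrived: int) -> bool:
    """Return True if re-running UIDBATCHES is permitted.

    `deleted` and `arrived` are assumed to be counts of expunged and newly
    received messages since the previous UIDBATCHES command.
    """
    threshold = batch_size / 2
    return deleted >= threshold or arrived >= threshold


if __name__ == "__main__":
    print(may_rerun_uidbatches(2000, deleted=999, arrived=0))
    print(may_rerun_uidbatches(2000, deleted=1000, arrived=0))
```

With a batch size of 2,000, the first call is below the threshold of 1,000 deletions and the second call reaches it.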
This extension puts a hard limit on the minimum batch size that a client can request, and it gives the server some flexibility in the actual size being returned. This gives server implementations room to make this operation less resource-intensive, and it ensures that clients do not misuse this extension to deduce message sequence numbers.¶
The intent of this extension is to work well in combination with [RFC9586] UIDONLY mode without creating a de-facto loophole that re-introduces sequence numbers.¶
Section 3.2 also outlines some reasoning for these limitations.¶
The server MUST support batch sizes of 500 messages or larger.¶
The server MUST respond with BAD and a response code TOO SMALL if the client uses a batch size that is smaller than the minimum allowed by the server, e.g.¶
S: A302 BAD [TOO SMALL] Minimum batch size is 500¶
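On the server side, this check could look like the following Python sketch. The names MIN_BATCH_SIZE and check_batch_size are hypothetical, and the minimum of 500 is an assumed server policy; the response text mirrors the example above.

```python
from typing import Optional

# Assumed server policy. The text only requires that sizes of 500 or
# larger are supported, so a server may choose a lower minimum.
MIN_BATCH_SIZE = 500


def check_batch_size(tag: str, batch_size: int) -> Optional[str]:
    """Return a tagged BAD response if the batch size is too small, else None."""
    if batch_size < MIN_BATCH_SIZE:
        return f"{tag} BAD [TOO SMALL] Minimum batch size is {MIN_BATCH_SIZE}"
    return None


if __name__ == "__main__":
    print(check_batch_size("A302", 100))
```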
The server MUST NOT return ranges that contain more than the number of messages per batch requested by the client (2,000 in the above example). But the server MAY return fewer messages per range, notably if that makes the implementation simpler and/or more efficient.¶
Servers SHOULD NOT return batches that are substantially smaller, and SHOULD aim to be within 90% of the requested size. The client is likely to pick a batch count based on what it wants to display to the user. A client may e.g. request 2 batches of size 1,000 if it wants to be able to display the last 2,000 messages to the user.¶
A server MAY return batches that are substantially smaller if there are changes in mailbox state during the execution of the UIDBATCHES command, namely messages being expunged, such that the overall size of the mailbox changes. The client would be able to infer this from the EXPUNGE, VANISHED, or EXISTS responses it receives.¶
If the total number of messages is not evenly divisible by the requested batch size, the last batch will contain the remainder. Thus, the last batch in the mailbox (i.e. the batch with the lowest UIDs) will usually have fewer messages than the requested number of messages.¶
The server MUST reply with a UIDBATCHES response, even if no ranges are returned (see below). The UIDBATCHES response MUST include the tag of the command it relates to (similar to an ESEARCH response), and it MUST include the UID indicator.¶
The UID ranges in the response must be ordered from the highest UIDs to the lowest, i.e. descending order of UIDs.¶
The server MAY return UID ranges with UIDs that do not exist on the server. The client as a result MUST NOT make assumptions about the existence of messages. If the server returns the response¶
S: * UIDBATCHES (TAG "A302") UID ALL 163886:99703,99696:20358,20351:7841,7830:1
S: A302 OK UIDBATCHES Completed¶
there may not be any messages on the server with UIDs such as 163886, 99703, or 99696.¶
The range 163886:99703 will span approximately the requested number of messages (see note above), but its start and end UIDs may not correspond to messages on the server.¶
This gives the server implementation some flexibility as to which UID ranges to return. It might, e.g., return 163886:99697 and 99696:20358 instead of 163886:99703 and 99696:20358, assuming that there are no messages in the range 99704:99697.¶
If there are fewer messages in the mailbox than the requested batch size, the server would return a single batch that contains all messages in the mailbox.¶
Servers SHOULD end the last UID batch in the mailbox with UID 1 even if this UID does not exist on the server. This makes it unambiguous to the client that this range is in fact the last range.¶
A client can optionally provide a batch range. The server will then limit its response to the UID ranges corresponding to the specified batch indices. For example, if the client sends¶
C: A302 UIDBATCHES 2000 10:20¶
for a mailbox with more than 40,000 messages, the server would return the 10th to 20th batches, corresponding to the 20,000th and 40,000th message respectively.¶
Note that batches start at the highest UIDs: batch 1 is the batch with the highest UIDs.¶
The UID ranges that the server returns would still split the mailbox’s messages into batches of the requested size (2,000 in the example).¶
If the client requests more batches than exist on the server, the server would return those that do exist. For example if the client sends¶
C: A302 UIDBATCHES 2000 1:4¶
and the selected mailbox has 7,000 messages, the server would then return a UIDBATCHES response with only 4 UID ranges.¶
Similarly, if the requested batch size is equal to or larger than the number of messages in the mailbox, the server MUST return a response with a single UID range that spans all messages.¶
When the client issues any valid UIDBATCHES command and the mailbox is empty, the server MUST reply with a UIDBATCHES response. This UIDBATCHES response will not have an ALL part, similar to a UID SEARCH that doesn't match any messages, e.g.¶
S: * UIDBATCHES (TAG "A302") UID
S: A302 OK UIDBATCHES Completed¶
Similarly, if the client requests a range of batches, and these batches do not exist, the server MUST reply with a UIDBATCHES response without an ALL part. If the mailbox has 7,000 messages, and the client sends¶
C: A302 UIDBATCHES 2000 6:8¶
the server would respond with¶
S: * UIDBATCHES (TAG "A302") UID
S: A302 OK UIDBATCHES Completed¶
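The handling of a requested batch range can be sketched in Python as a slice over the full list of batches. The function name select_batches is hypothetical, and the sketch assumes 1-based batch indices with the first index not greater than the last, which the text above does not state explicitly.

```python
def select_batches(
    all_batches: list[tuple[int, int]], first: int, last: int
) -> list[tuple[int, int]]:
    """Return batches first..last (1-based, inclusive), newest batch first.

    Indices beyond the end of the list are ignored, so a range that lies
    entirely past the last batch yields an empty list.
    """
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    return all_batches[first - 1:last]


if __name__ == "__main__":
    # Four batches for a mailbox with 7,000 messages and a batch size of 2,000,
    # written as sequence-number pairs for illustration.
    batches = [(7000, 5001), (5000, 3001), (3000, 1001), (1000, 1)]
    print(select_batches(batches, 1, 4))
    print(select_batches(batches, 6, 8))
```

An empty result corresponds to a UIDBATCHES response without an ALL part.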
The server MAY return fewer UID ranges than requested by the client even if the mailbox contains more messages. Servers may choose to do so to reduce resource demands when processing large mailboxes. The client knows what the message count in the mailbox is, and it can trivially determine if the server returned all UID ranges or not.¶
As noted in Section 3.1.5, the server SHOULD end the last UID batch in the mailbox with UID 1. This will make it easy for the client to know if the server did in fact return the last batch.¶
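A client can use this convention to check whether a response reached the end of the mailbox. The Python sketch below is an illustration only; the function name includes_last_batch is hypothetical, and ranges are assumed to be (highest, lowest) UID pairs in the order the server returned them. Because ending at UID 1 is a SHOULD and not a MUST, a negative result is a hint and not proof that more batches exist.

```python
def includes_last_batch(ranges: list[tuple[int, int]]) -> bool:
    """Return True if the final returned range ends at UID 1."""
    return bool(ranges) and ranges[-1][1] == 1


if __name__ == "__main__":
    full = [(163886, 99703), (99696, 20358), (20351, 7841), (7830, 1)]
    print(includes_last_batch(full))
    print(includes_last_batch(full[:2]))
```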
Servers MUST at least return the first 40 batches unless the client requested fewer. Servers SHOULD at least return the first 100 batches unless the client requested fewer.¶
The UIDBATCHES command is in effect nothing more than shorthand for a UID SEARCH command of the form¶
C: A145 UID SEARCH RETURN () <N-M>,<N-2*M>,<N-3*M>,...¶
where N is the number of messages in the mailbox and M is the requested batch size.¶
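For servers that do not support UIDBATCHES, a client could build the equivalent UID SEARCH command itself. The following Python sketch shows the arithmetic; the function name fallback_search_command is hypothetical, and the sketch returns None when the mailbox fits into a single batch, since no boundary needs to be looked up in that case.

```python
from typing import Optional


def fallback_search_command(tag: str, message_count: int, batch_size: int) -> Optional[str]:
    """Build a UID SEARCH command for the sequence numbers N-M, N-2*M, ...

    Returns None if there is no batch boundary inside the mailbox.
    """
    if message_count < 0 or batch_size < 1:
        raise ValueError("message_count must be >= 0 and batch_size >= 1")
    # Step down from N-M in steps of M, stopping before sequence number 0.
    boundaries = range(message_count - batch_size, 0, -batch_size)
    numbers = ",".join(str(number) for number in boundaries)
    if not numbers:
        return None
    return f"{tag} UID SEARCH RETURN () {numbers}"


if __name__ == "__main__":
    print(fallback_search_command("A145", 6823, 2000))
```

For 6823 messages and a batch size of 2,000, the command searches for the sequence numbers 4823, 2823, and 823.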
The special purpose UIDBATCHES command, though, tries to address two problems:¶
By providing a special purpose command, servers can implement a different, optimized code path for determining message batches. And servers using the UIDONLY extension can provide a facility to let the client determine message batches without using sequence numbers in a UID SEARCH command.¶
Section 3.1.3 describes some implementation restrictions to ensure this.¶
The PARTIAL extension provides a different way for the client to split its commands into batches by using paged SEARCH and FETCH.¶
The intention of the UIDBATCHES command is to let the client pre-determine message batches of a desired size.¶
This makes it easier for the client to share implementation between servers regardless of their support of PARTIAL. And additionally, because the client can issue a corresponding UID SEARCH command to servers that do not implement UIDBATCHES, the client can use similar batching implementations for servers that support UIDBATCHES and those that do not.¶
When the server supports both the MESSAGELIMIT and UIDBATCHES extensions, the client SHOULD request batches no larger than the specified maximum number of messages that can be processed in a single command. The client MAY choose to use a smaller batch size.¶
Additionally, since servers MAY limit the number of UIDs returned in response to UIDBATCHES, it is reasonable to assume that they would at most return N UIDs where N is the limit the server announced as its MESSAGELIMIT.¶
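The batch size selection described above amounts to clamping the size the client would like to use to the announced limit. A minimal Python sketch, assuming a hypothetical function choose_batch_size and a message_limit argument that is None when the server does not announce MESSAGELIMIT:

```python
from typing import Optional


def choose_batch_size(desired: int, message_limit: Optional[int]) -> int:
    """Return the batch size to request, capped by the server's MESSAGELIMIT."""
    if message_limit is None:
        return desired
    return min(desired, message_limit)


if __name__ == "__main__":
    print(choose_batch_size(2000, 1000))
```

The sketch does not consider the server's minimum batch size; a client would still need to handle a BAD response if the resulting size is too small.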
As noted above, the UIDBATCHES extension allows clients to create UID ranges for message batches even when the connection operates in UIDONLY mode, which otherwise does not allow the use of message sequence numbers.¶
Servers that support SEARCHRES [RFC5182] MUST NOT store the result of UIDBATCHES in the $ variable.¶
The following syntax specification uses the Augmented Backus-Naur Form (ABNF) notation as specified in [RFC5234].¶
Non-terminals referenced but not defined below are as defined by IMAP4 [RFC3501].¶
Except as noted otherwise, all alphabetic characters are case-insensitive. The use of upper or lower case characters to define token strings is for editorial clarity only. Implementations MUST accept these strings in a case-insensitive fashion.¶
capability =/ "UIDBATCHES" ;; <capability> from [RFC3501]
command-select =/ message-batches
message-batches = "UIDBATCHES" SP nz-number [SP nz-number ":" nz-number]
uidbatches-response = "UIDBATCHES" search-correlator SP "UID" [ALL tagged-ext-simple]
mailbox-data =/ uidbatches-response¶
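As an illustration of the message-batches rule, the following Python sketch parses a tagged UIDBATCHES command line. The function name parse_uidbatches_command and the regular expression are assumptions made for this example; the tag is simplified to any run of non-space characters, and nz-number is taken to be a non-zero digit followed by any number of digits, as in [RFC3501].

```python
import re
from typing import Optional

# Hypothetical pattern: tag SP "UIDBATCHES" SP nz-number [SP nz-number ":" nz-number]
_COMMAND_RE = re.compile(
    r"^(?P<tag>\S+) UIDBATCHES (?P<size>[1-9][0-9]*)"
    r"(?: (?P<first>[1-9][0-9]*):(?P<last>[1-9][0-9]*))?$",
    re.IGNORECASE,
)


def parse_uidbatches_command(line: str) -> tuple[str, int, Optional[tuple[int, int]]]:
    """Return (tag, batch size, optional (first, last) batch range)."""
    match = _COMMAND_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a valid UIDBATCHES command: {line!r}")
    batch_range = None
    if match.group("first"):
        batch_range = (int(match.group("first")), int(match.group("last")))
    return match.group("tag"), int(match.group("size")), batch_range


if __name__ == "__main__":
    print(parse_uidbatches_command("A302 UIDBATCHES 2000 10:20"))
```

The command name is matched case-insensitively, in line with the note on case-insensitivity above.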
This document defines an additional IMAP4 capability. As such, it does not change the underlying security considerations of IMAP4rev1 [RFC3501] and IMAP4rev2 [RFC9051].¶
This document defines an optimization that can both reduce the amount of work performed by the server, as well as the amount of data returned to the client. Use of this extension is likely to cause the server and the client to use less memory than when the extension is not used. However, as this is going to be new code in both the client and the server, rigorous testing of such code is required in order to avoid the introduction of new implementation bugs.¶
IMAP4 capabilities are registered by publishing a standards track or IESG approved Informational or Experimental RFC. The registry is currently located at:¶
https://www.iana.org/assignments/imap4-capabilities¶
IANA is requested to add registrations of the "UIDBATCHES" capability to this registry, pointing to this document.¶