Features

PMSS offers a range of powerful features and options designed to enhance message scanning capabilities.

Multi-Part SMS

Messages that exceed the single-message length or size limits can be submitted through the Multi-Part SMS endpoint, which handles segmentation and reassembly. Each segment is scanned individually; once all parts have been received, PMSS reassembles the message and scans the complete message again, so both the individual segments and the full content are checked.

Requests to /scan/v1/sms/long/{type} must include the SMS Segmentation and Reassembly (SAR) values. These values are included in the request as illustrated below.

{
  "sar_msg_ref_num": 42152,
  "sar_total_segments": 3,
  "sar_segment_seqnum": 2
}
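
As a minimal sketch of how the SAR values relate across the segments of one message, the Python snippet below generates one set of SAR fields per segment. The helper name sar_fields is hypothetical, and the numbering of segments from 1 is an assumption (the example above only shows segment 2 of 3); the three field names are taken from the example above.

```python
def sar_fields(msg_ref_num: int, total_segments: int) -> list[dict]:
    """Return one dict of SAR values per segment of a multi-part message.

    Assumption: segments are numbered 1..total_segments and every segment
    of the same message shares the same reference number.
    """
    if total_segments < 1:
        raise ValueError("total_segments must be at least 1")
    return [
        {
            "sar_msg_ref_num": msg_ref_num,
            "sar_total_segments": total_segments,
            "sar_segment_seqnum": seq,
        }
        for seq in range(1, total_segments + 1)
    ]


for fields in sar_fields(42152, 3):
    print(fields)
```

Each dict would be merged into the request for the corresponding segment, alongside the other fields the query requires.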

SMS Encoding Support

PMSS honors the foundational principles of SMS while addressing contemporary requirements, regardless of the specific role your application plays within the mobile ecosystem.

The GSM standard, established in the 1990s, set the original payload limit at 140 bytes, and that limit was built into the network signaling protocols. Transmission efficiency was a major concern at the time, so GSM-7 encodes each character in 7 bits instead of 8, which lets 160 characters fit into the 140-byte payload (160 × 7 bits = 1,120 bits = 140 bytes). A saving of 1 bit per character seems small, but it yields 20 extra characters per message.

Binary data (DCS 4) uses the full 140 bytes in raw form. Unicode text (DCS 8), which includes emojis, is encoded as UTF-16 at 2 bytes per code unit, so 70 code units fit into the same space; characters outside the Basic Multilingual Plane, including most emojis, take two code units each. For multipart SMS, each 140-byte segment must also carry the segmentation header, which leaves room for 153 GSM 7-bit characters per segment.
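
The arithmetic behind these limits can be sketched in a few lines of Python. The 6-byte header size is an assumption (it corresponds to the common concatenation header with an 8-bit reference number), and the function name capacity is hypothetical.

```python
PAYLOAD_BYTES = 140  # SMS payload limit described above
UDH_BYTES = 6        # assumed size of the concatenation header per segment


def capacity(payload_bytes: int) -> dict:
    """Return how many units of each encoding fit into payload_bytes."""
    return {
        "gsm7_chars": (payload_bytes * 8) // 7,  # 7 bits per character
        "binary_bytes": payload_bytes,           # raw 8-bit data
        "utf16_units": payload_bytes // 2,       # 2 bytes per code unit
    }


single_part = capacity(PAYLOAD_BYTES)
per_segment = capacity(PAYLOAD_BYTES - UDH_BYTES)

print(single_part)
print(per_segment)
```

Under that assumption, a single-part message holds 160 GSM 7-bit characters or 70 UTF-16 code units, while each segment of a multipart message holds 153 GSM 7-bit characters.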

Modern networks have reduced the impact of these constraints, but legacy systems continue to operate. To ensure compatibility and accurate delivery across messaging platforms, messages must be handled in any SMS encoding format.

Supported formats include:

  • GSM-7 / GSM-7-Packed
  • UCS-2 (UTF-16 BE/LE)
  • 8-bit Binary
  • Latin-1

The DCS value in each query must match the message type as shown in the table below; any other combination results in a query error.

Type  DCS   Description    Max Length
ao    0, 3  Latin-1        140 bytes
ao    4     8-bit binary   140 bytes
ao    8     UTF-16 (UCS2)  70 characters
at    0, 3  Latin-1        140 bytes
at    4     8-bit binary   140 bytes
at    8     UTF-16 (UCS2)  70 characters
mo    0     GSM 7-bit      160 characters
mo    4     8-bit binary   140 bytes
mo    8     UTF-16 (UCS2)  70 characters
mt    0     GSM 7-bit      160 characters
mt    4     8-bit binary   140 bytes
mt    8     UTF-16 (UCS2)  70 characters

Notes:

  • DCS 0 (MT/MO): Use for GSM 7-bit encoding (160 characters max).
  • DCS 0/3 (AT/AO): Latin-1 encoding (ISO-8859-1, 140 bytes max).
  • DCS 4: Use for ASCII-to-binary (140 bytes max).
  • DCS 8: Plain text with UTF-16/UCS2 encoding (70 characters max).
  • DCS 8: UTF-16 can be overridden with UTF-16LE or UTF-16BE with the "encode_option" parameter.
    • encode_option = utf16le or utf16be
note

The limits above apply to single-part SMS messages. In multi-part SMS, the usable length per segment is smaller because each segment also carries a User Data Header (UDH). For example, a GSM 7-bit encoded segment can hold at most 153 characters.
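
A client can check the type and DCS combination before sending a query. The sketch below transcribes the type/DCS table into a Python lookup; the names LIMITS and lookup_limit are hypothetical, and the check only mirrors the documented table rather than PMSS's own validation.

```python
# (type, dcs) -> (description, limit, unit), transcribed from the table above.
LIMITS = {}
for msg_type in ("ao", "at"):
    LIMITS[(msg_type, 0)] = ("Latin-1", 140, "bytes")
    LIMITS[(msg_type, 3)] = ("Latin-1", 140, "bytes")
for msg_type in ("mo", "mt"):
    LIMITS[(msg_type, 0)] = ("GSM 7-bit", 160, "characters")
for msg_type in ("ao", "at", "mo", "mt"):
    LIMITS[(msg_type, 4)] = ("8-bit binary", 140, "bytes")
    LIMITS[(msg_type, 8)] = ("UTF-16 (UCS2)", 70, "characters")


def lookup_limit(msg_type: str, dcs: int) -> tuple:
    """Return (description, limit, unit), or raise if the pair is not listed."""
    key = (msg_type.lower(), dcs)
    if key not in LIMITS:
        raise ValueError(f"DCS {dcs} is not listed for type '{msg_type}'")
    return LIMITS[key]


print(lookup_limit("mo", 0))
print(lookup_limit("ao", 3))
```

A combination that is absent from the table, such as type mo with DCS 3, raises ValueError here, which corresponds to the query error described above.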

UTF16 LE/BE Override

Every SMS query must include the correct DCS value in the request, as outlined below. For messages with DCS 8, the default UTF-16 encoding can be overridden with UTF-16LE or UTF-16BE using the "encode_option" parameter in the request body.

Below is an example for an "AO" message with DCS 8 in which UTF-16LE overrides the default UTF-16 encoding.

{
  "dcs": 8,
  "encode_option": "utf16le"
}
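
To illustrate why the byte order matters, the following Python sketch encodes the same text with both options. The mapping from the "encode_option" values to Python codec names and the helper encode_body are assumptions for illustration; how the encoded body is carried in the request is not covered here.

```python
# Assumed mapping from encode_option values to Python codec names.
CODECS = {"utf16le": "utf-16-le", "utf16be": "utf-16-be"}


def encode_body(text: str, encode_option: str) -> bytes:
    """Encode text as UTF-16 in the byte order named by encode_option."""
    return text.encode(CODECS[encode_option])


text = "Hi €"
le = encode_body(text, "utf16le")
be = encode_body(text, "utf16be")

# Same code units, but the two bytes of each unit are swapped.
print(le.hex())
print(be.hex())
```

Both results are 8 bytes long (4 code units), but decoding one with the other's byte order produces different characters, so the option has to match how the message body was actually encoded.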

Encoding Enforcement Modes

From core to edge, PMSS SMS encoding options combine historical precision with modern flexibility.

By default, PMSS runs in "relaxed" mode, meaning it will accept messages with lengths/sizes that exceed the message's encoding restrictions.

If desired, queries can be run in "strict" mode, meaning PMSS will return an error for messages that exceed the encoding restriction limits.

Response:
{
  "response": {
    "code": 400,
    "message": "Message body exceeds Latin-1 limit of 140 bytes for type 'ao' (got 317 bytes)"
  }
}

Alternatively, queries can be run in "info" mode, where the strict checks are performed and message encoding info is returned in the response.

Response:
{
  "response": {
    "action": "allow",
    "code": 200
  },
  "encoding_info": {
    "valid": false,
    "encoding": "latin1",
    "max_length": 134,
    "length": 317
  }
}
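
A client might handle the strict and info responses along these lines. This is a sketch based only on the two example responses above; the function name summarize is hypothetical, and real responses may contain fields or error cases not shown here.

```python
def summarize(resp: dict) -> str:
    """Describe a scan response using only the fields shown in the examples."""
    status = resp.get("response", {})
    info = resp.get("encoding_info")

    # Strict mode: an over-limit message comes back as an error.
    if status.get("code") == 400:
        return f"error: {status.get('message', 'no message given')}"

    # Info mode: the query succeeds and the encoding check is reported.
    if info is not None and not info.get("valid", True):
        return (
            f"action={status.get('action')}, but {info['encoding']} "
            f"length {info['length']} exceeds max_length {info['max_length']}"
        )
    return f"action={status.get('action')}"


strict_resp = {
    "response": {
        "code": 400,
        "message": "Message body exceeds Latin-1 limit of 140 bytes for type 'ao' (got 317 bytes)",
    }
}
info_resp = {
    "response": {"action": "allow", "code": 200},
    "encoding_info": {"valid": False, "encoding": "latin1", "max_length": 134, "length": 317},
}

print(summarize(strict_resp))
print(summarize(info_resp))
```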

MDN Country Resolution

Country codes for both senders and recipients play a vital role in our policy frameworks. These help ensure compliance with local regulations and allow for tailored messaging strategies that resonate with specific audiences.

The API response includes these codes.

Response:
{
  "response": {
    "code": 200,
    "action": "accept"
  },
  "from_meta": {
    "ccode": "US"
  },
  "to_meta": {
    "ccode": "GB"
  }
}
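
Reading the country codes from a response like the one above could look as follows. The helper name country_codes is hypothetical, and the sketch assumes that either metadata block may be absent, in which case None is returned for that side.

```python
def country_codes(resp: dict) -> tuple:
    """Return (sender_country, recipient_country) from a scan response."""
    sender = resp.get("from_meta", {}).get("ccode")
    recipient = resp.get("to_meta", {}).get("ccode")
    return sender, recipient


example = {
    "response": {"code": 200, "action": "accept"},
    "from_meta": {"ccode": "US"},
    "to_meta": {"ccode": "GB"},
}

sender, recipient = country_codes(example)
print(sender, recipient)
```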

Content Categorization

PMSS can identify, report on, and restrict messages containing commonly regulated and restricted content categories, based on operator and country-wide acceptable use policies.

Message & Image Clustering

PMSS can spot different versions of the same message, including messages with image content, by evaluating similarity thresholds with a range of hashing algorithms.

Sender Class Framework

Limits & Thresholds

Sender classes give Proofpoint's Managed Security Services team the flexibility to manage sender limits, thresholds, and policy behaviors in a performant manner. Classes can be assigned based on different criteria, like string identifiers or numeric scores, which allows for customizing our approach to sender management.

warning

If no sender class is provided, then a default "untrusted" classifier will be used.

  • Strings can be mapped to predefined classes (e.g., mapping "suspect_sender_1234" to "suspect")
  • Numeric values (1-100) are validated and mapped using customer-specific schemas or a published global schema, ensuring consistency and scalability.

To include a sender class in your query, add the "sender_class" field as either a string or a numeric value, as outlined below.

{
  "sender_class": "good_sender"
}

{
  "sender_class": 80
}
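
Before sending, a client could validate the value locally. The sketch below accepts a non-empty string or an integer from 1 to 100, following the ranges described above; the helper name sender_class_field is hypothetical, and the mapping of values to classes still happens on the PMSS side.

```python
def sender_class_field(value) -> dict:
    """Return the sender_class field for a string or a numeric score (1-100)."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("sender_class must be a string or an integer")
    if isinstance(value, int):
        if not 1 <= value <= 100:
            raise ValueError("numeric sender_class must be between 1 and 100")
        return {"sender_class": value}
    if isinstance(value, str) and value:
        return {"sender_class": value}
    raise TypeError("sender_class must be a non-empty string or an integer")


print(sender_class_field("good_sender"))
print(sender_class_field(80))
```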