module String
Library functions for strings.
Usage
import String;
Dependencies
extend Exception;
import List;
import ParseTree;
Description
The following library functions are defined for strings:
- arbString
- capitalize
- center
- charAt
- chars
- contains
- deescape
- endsWith
- escape
- findAll
- findFirst
- findLast
- format
- fromBase32
- fromBase64
- indent
- isEmpty
- isValidCharacter
- left
- replaceAll
- replaceFirst
- replaceLast
- reverse
- rexpMatch
- right
- size
- split
- squeeze
- startsWith
- stringChar
- stringChars
- substitute
- substring
- toBase32
- toBase64
- toInt
- toLocation
- toLowerCase
- toReal
- toUpperCase
- trim
- uncapitalize
- wrap
function center
Center a string in given space.
str center(str s, int n)
str center(str s, int n, str pad)
- Center string s in string of length n using spaces.
- Center string s in string of length n using pad as padding character.
Examples
rascal>import String;
ok
rascal>center("abc", 10);
str: "   abc    "
───
   abc    
───
rascal>center("abc", 10, "x");
str: "xxxabcxxxx"
───
xxxabcxxxx
───
function charAt
Return character in a string by its index position.
int charAt(str s, int i) throws IndexOutOfBounds
Return the character at position i in string s as integer character code.
Also see stringChar, which converts a character code back to a string.
Examples
rascal>import String;
ok
rascal>charAt("abc", 0);
int: 97
rascal>stringChar(charAt("abc", 0));
str: "a"
───
a
───
function chars
Return characters of a string.
list[int] chars(str s)
Return a list of the characters of s as integer character codes.
Also see stringChars, which converts a list of character codes back to a string.
Examples
rascal>import String;
ok
rascal>chars("abc");
list[int]: [97,98,99]
rascal>stringChars(chars("abc")) == "abc";
bool: true
function contains
Check that a string contains another string.
bool contains(str input, str find)
Check whether the string find occurs as a substring in the string input.
Examples
rascal>import String;
ok
rascal>contains("abracadabra", "bra");
bool: true
rascal>contains("abracadabra", "e");
bool: false
function deescape
Replace escaped characters by the escaped character itself (using Rascal escape conventions).
str deescape(str s)
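A minimal sketch of calling deescape, assuming that a backslash followed by n is one of the escapes it recognizes. The variable names are illustrative, and the result is printed rather than asserted because the exact set of recognized escapes is not listed here.

```rascal
import String;
import IO;

// Between "first" and "second" this string holds two characters:
// a backslash and the letter n (the backslash is escaped in the literal).
str raw = "first\\nsecond";

// deescape replaces escape sequences by the characters they denote.
str cooked = deescape(raw);

println(size(raw));     // length before de-escaping
println(size(cooked));  // length after de-escaping
```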
function endsWith
Check whether a string ends with a given substring.
bool endsWith(str subject, str suffix)
Yields true if string subject ends with the string suffix.
Examples
rascal>import String;
ok
rascal>endsWith("Hello.rsc", ".rsc");
bool: true
function escape
Replace single characters in a string.
str escape(str subject, map[str,str] mapping)
Return a copy of subject in which each single character that is a key in mapping
has been replaced by its associated value.
Examples
rascal>import String;
ok
rascal>import IO;
ok
rascal>escape("abracadabra", ("a" : "AA", "c" : "C"));
str: "AAbrAACAAdAAbrAA"
───
AAbrAACAAdAAbrAA
───
rascal>L = escape("\"Good Morning\", he said", ("\"": "\\\""));
str: "\\\"Good Morning\\\", he said"
───
\"Good Morning\", he said
───
rascal>println(L);
\"Good Morning\", he said
ok
function findAll
Find all occurrences of a string in another string.
list[int] findAll(str subject, str find)
Find all occurrences of string find in string subject.
The result is a (possibly empty) list of positions where find matches.
See also Find First and Find Last.
Examples
rascal>import String;
ok
rascal>findAll("abracadabra", "a");
list[int]: [0,3,5,7,10]
rascal>findAll("abracadabra", "bra");
list[int]: [1,8]
rascal>findAll("abracadabra", "e");
list[int]: []
function findFirst
Find the first occurrence of a string in another string.
int findFirst(str subject, str find)
Find the first occurrence of string find in string subject.
The result is either a position in subject or -1 when find is not found.
Also see Find All and Find Last.
Examples
rascal>import String;
ok
rascal>findFirst("abracadabra", "a");
int: 0
rascal>findFirst("abracadabra", "bra");
int: 1
rascal>findFirst("abracadabra", "e");
int: -1
function findLast
Find the last occurrence of a string in another string.
int findLast(str subject, str find)
Find the last occurrence of string find in string subject.
The result is either a position in subject or -1 when find is not found.
Also see Find All and Find First.
Examples
rascal>import String;
ok
rascal>findLast("abracadabra", "a");
int: 10
rascal>findLast("abracadabra", "bra");
int: 8
rascal>findLast("abracadabra", "e");
int: -1
function isEmpty
Check whether a string is empty.
bool isEmpty(str s)
Returns true if string s is empty.
Examples
rascal>import String;
ok
rascal>isEmpty("");
bool: true
rascal>isEmpty("abc");
bool: false
function arbString
Generate an arbitrary string.
str arbString(int n)
Returns a string of maximum length n, consisting of arbitrary characters.
Examples
rascal>import String;
ok
rascal>arbString(3);
str: "𒀃𒋈𡅡𒆒𒁃"
───
𒀃𒋈𡅡𒆒𒁃
───
rascal>arbString(10);
str: "\t\t\n\t \u2028"
───
───
function left
Left alignment of string in given space.
str left(str s, int n)
str left(str s, int n, str pad)
- Left align string s in string of length n using spaces.
- Left align string s in string of length n using pad as pad character.
Examples
rascal>import String;
ok
rascal>left("abc", 10);
str: "abc       "
───
abc       
───
rascal>left("abc", 10, "x");
str: "abcxxxxxxx"
───
abcxxxxxxx
───
function replaceAll
Replace all occurrences of a string in another string.
str replaceAll(str subject, str find, str replacement)
Return a copy of subject in which all occurrences of find (if any) have been replaced by replacement.
Also see Replace First and Replace Last.
Examples
rascal>import String;
ok
rascal>replaceAll("abracadabra", "a", "A");
str: "AbrAcAdAbrA"
───
AbrAcAdAbrA
───
rascal>replaceAll("abracadabra", "ra", "RARA");
str: "abRARAcadabRARA"
───
abRARAcadabRARA
───
rascal>replaceAll("abracadabra", "cra", "CRA");
str: "abracadabra"
───
abracadabra
───
Pitfalls
Note that find is a string (as opposed to, for instance, a regular expression in Java).
function replaceFirst
Replace the first occurrence of a string in another string.
str replaceFirst(str subject, str find, str replacement)
Return a copy of subject in which the first occurrence of find (if it exists) has been replaced by replacement.
Also see Replace All and Replace Last.
Examples
rascal>import String;
ok
rascal>replaceFirst("abracadabra", "a", "A");
str: "Abracadabra"
───
Abracadabra
───
rascal>replaceFirst("abracadabra", "ra", "RARA");
str: "abRARAcadabra"
───
abRARAcadabra
───
rascal>replaceFirst("abracadabra", "cra", "CRA");
str: "abracadabra"
───
abracadabra
───
Pitfalls
Note that find is a string (as opposed to, for instance, a regular expression in Java).
function replaceLast
Replace the last occurrence of a string in another string.
str replaceLast(str subject, str find, str replacement)
Return a copy of subject in which the last occurrence of find (if it exists) has been replaced by replacement.
Also see Replace All and Replace First.
Examples
rascal>import String;
ok
rascal>replaceLast("abracadabra", "a", "A");
str: "abracadabrA"
───
abracadabrA
───
rascal>replaceLast("abracadabra", "ra", "RARA");
str: "abracadabRARA"
───
abracadabRARA
───
rascal>replaceLast("abracadabra", "cra", "CRA");
str: "abracadabra"
───
abracadabra
───
Pitfalls
Note that find is a string (as opposed to, for instance, a regular expression in Java).
function reverse
Return a string with all characters in reverse order.
str reverse(str s)
Returns string with all characters of string s in reverse order.
Examples
rascal>import String;
ok
rascal>reverse("abc");
str: "cba"
───
cba
───
function right
Right alignment of a string value in a given space.
Right align s in string of length n using space.
str right(str s, int n)
str right(str s, int n, str pad)
- Right align string s in string of length n using spaces.
- Right align string s in string of length n using pad as pad character.
Examples
rascal>import String;
ok
rascal>right("abc", 10);
str: "       abc"
───
       abc
───
rascal>right("abc", 10, "x");
str: "xxxxxxxabc"
───
xxxxxxxabc
───
function size
Determine length of a string value.
int size(str s)
Returns the length (number of characters) of string s.
Examples
rascal>import String;
ok
rascal>size("abc");
int: 3
rascal>size("");
int: 0
function startsWith
Check whether a string starts with a given prefix.
bool startsWith(str subject, str prefix)
Yields true if string subject starts with the string prefix.
Examples
rascal>import String;
ok
rascal>startsWith("Hello.rsc", "Hell");
bool: true
function stringChar
Convert a character code into a string.
str stringChar(int char) throws IllegalArgument
function stringChars
Convert a list of character codes into a string.
str stringChars(list[int] chars) throws IllegalArgument
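The two conversions above are the inverses of charAt and chars. The following sketch, which can be pasted into the Rascal REPL, restates the round trips shown in the charAt and chars examples; the variable names are illustrative.

```rascal
import String;

// charAt and stringChar are inverses for a single character.
int code = charAt("abc", 0);       // 97, the character code of "a"
str single = stringChar(code);     // "a"

// chars and stringChars do the same for a whole string.
list[int] codes = chars("abc");    // [97,98,99]
str rebuilt = stringChars(codes);  // "abc"
```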
function isValidCharacter
Check that a given integer value is a valid Unicode code point.
bool isValidCharacter(int ch)
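Because stringChar throws IllegalArgument, isValidCharacter can serve as a guard before converting. The helper below is hypothetical (safeStringChar is not part of the library) and only sketches the idea.

```rascal
import String;

// Hypothetical helper: returns the one-character string for a code point,
// or the given fallback when the integer is not a valid Unicode code point.
str safeStringChar(int code, str fallback)
  = isValidCharacter(code) ? stringChar(code) : fallback;

str ok = safeStringChar(97, "?");   // 97 is a valid code point, so this is "a"
str bad = safeStringChar(-1, "?");  // negative values are not code points
```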
function substring
Extract a substring from a string value.
str substring(str s, int begin)
str substring(str s, int begin, int end)
- Yields substring of string s from index begin to the end of the string.
- Yields substring of string s from index begin to (but not including) index end.
Examples
rascal>import String;
ok
rascal>substring("abcdef", 2);
str: "cdef"
───
cdef
───
rascal>substring("abcdef", 2, 4);
str: "cd"
───
cd
───
function toInt
Convert a string value to integer.
int toInt(str s) throws IllegalArgument
int toInt(str s, int r) throws IllegalArgument
- Converts string s to integer.
- Converts string s to integer using radix r.
Throws IllegalArgument when s cannot be converted.
Examples
rascal>import String;
ok
rascal>toInt("11");
int: 11
rascal>toInt("11", 8);
int: 9
Now try an erroneous argument:
rascal>toInt("abc");
|std:///String.rsc|(10801,456,<430,0>,<450,52>): IllegalArgument("abc","For input string: \"abc\"")
at *** somewhere ***(|std:///String.rsc|(10801,456,<430,0>,<450,52>))
at toInt(|prompt:///|(6,5,<1,6>,<1,11>))
function toLowerCase
Convert the characters in a string value to lower case.
str toLowerCase(str s)
Convert all characters in string s to lowercase. Also see To Upper Case.
Examples
rascal>import String;
ok
rascal>toLowerCase("AaBbCc123");
str: "aabbcc123"
───
aabbcc123
───
function toReal
Convert a string value to real.
real toReal(str s)
Converts string s to a real. Throws IllegalArgument when s cannot be converted.
Examples
rascal>import String;
ok
rascal>toReal("2.5e-3");
real: 0.0025
rascal>toReal("123");
real: 123.
rascal>toReal("abc");
|std:///String.rsc|(11678,320,<471,0>,<484,31>): IllegalArgument()
at *** somewhere ***(|std:///String.rsc|(11678,320,<471,0>,<484,31>))
at toReal(|prompt:///|(7,5,<1,7>,<1,12>))
function toUpperCase
Convert the characters in a string value to upper case.
str toUpperCase(str s)
Converts all characters in string s to upper case.
Also see To Lower Case.
Examples
rascal>import String;
ok
rascal>toUpperCase("AaBbCc123");
str: "AABBCC123"
───
AABBCC123
───
function trim
Returns string with leading and trailing whitespace removed.
str trim(str s)
Examples
rascal>import String;
ok
rascal>trim(" jelly
|1 >>>>beans ");
str: "jelly\nbeans"
───
jelly
beans
───
function squeeze
Squeeze repeated occurrences of characters.
str squeeze(str src, str charSet)
Returns src with each run of a repeated character from charSet reduced to a single occurrence.
See the Apache Commons Lang documentation (http://commons.apache.org/lang/api-2.6/index.html?org/apache/commons/lang/text/package-summary.html)
for the allowed syntax in charSet.
Examples
rascal>import String;
ok
rascal>squeeze("hello", "el");
str: "helo"
───
helo
───
The other squeeze function uses character class types instead:
rascal>squeeze("hello", "el") == squeeze("hello", #[el]);
bool: true
function squeeze
Squeeze repeated occurrences of characters.
str squeeze(str src, type[&CharClass] _:type[![]] _)
Returns src with each run of a repeated character reduced to a single occurrence, if that character is a member of &CharClass.
- src is any string
- &CharClass is a reified character class type such as [a-z] (a type that is a subtype of the class of all characters ![])
- To pass in a char-class type, use the type reifier operator: #[a-z] or #![]
Examples
rascal>import String;
ok
rascal>squeeze("hello", #[el]);
str: "helo"
───
helo
───
Benefits
- To squeeze all characters, use the universal character class: #![]
- This function is type-safe; you can only pass in correct reified character classes like #[A-Za-z].
Pitfalls
- ![] excludes the 0th Unicode character, so we cannot squeeze the Unicode codepoint 0 using this function. If you really need to squeeze 0 then it's best to write your own:
visit (x) {
case /<dot:.>+/ => "\a00" when dot == "\a00"
}
- Do not confuse the character 0 (codepoint 48) with the zero codepoint: #[0] != #[\a00]
function split
Split a string into a list of strings based on a literal separator.
list[str] split(str sep, str src)
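A short sketch of a call, mainly to highlight the argument order: the separator comes first and the string to split second. The variable name is illustrative.

```rascal
import String;

// Separator first, then the string to split.
list[str] fields = split(",", "a,b,c");  // ["a","b","c"]
```

Since the separator is described as literal, characters that are special in regular expressions are expected to need no escaping here.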
function capitalize
str capitalize(str src)
function uncapitalize
str uncapitalize(str src)
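Neither capitalize nor uncapitalize is described above. The sketch below assumes, as the names suggest, that they change only the case of the first character; the variable names are illustrative.

```rascal
import String;

str name = "rascal";

// Assumed behavior: only the first character changes case.
str upperFirst = capitalize(name);          // "Rascal"
str lowerFirst = uncapitalize(upperFirst);  // "rascal"
```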
function toBase64
Base-64 encode the characters of a string.
str toBase64(str src, str charset=DEFAULT_CHARSET, bool includePadding=true)
Convert the characters of a string to a list of bytes using UTF-8 encoding and then encode these bytes using base-64 encoding as defined by RFC 4648: https://www.ietf.org/rfc/rfc4648.txt.
function fromBase64
Decode a base-64 encoded string.
str fromBase64(str src, str charset=DEFAULT_CHARSET)
Convert a base-64 encoded string to bytes and then convert these bytes to a string using the specified character set. The base-64 encoding used is defined by RFC 4648: https://www.ietf.org/rfc/rfc4648.txt.
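A round trip through toBase64 and fromBase64 can be sketched as follows, assuming the default character set on both sides. The expected strings follow from RFC 4648 for the ASCII input "Hello"; the variable names are illustrative.

```rascal
import String;

str original = "Hello";

// Encode with the default character set and padding.
str encoded = toBase64(original);  // "SGVsbG8=" per RFC 4648

// Decoding gives back the original string.
str decoded = fromBase64(encoded);

// Padding can be switched off with the keyword parameter from the signature.
str unpadded = toBase64(original, includePadding=false);  // "SGVsbG8"
```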
function toBase32
Base-32 encode the characters of a string.
str toBase32(str src, str charset=DEFAULT_CHARSET, bool includePadding=true)
Convert the characters of a string to a list of bytes using UTF-8 encoding and then encode these bytes using base-32 encoding as defined by RFC 4648: https://www.ietf.org/rfc/rfc4648.txt.
function fromBase32
Decode a base-32 encoded string.
str fromBase32(str src, str charset=DEFAULT_CHARSET)
Convert a base-32 encoded string to bytes and then convert these bytes to a string using the specified character set. The base-32 encoding used is defined by RFC 4648: https://www.ietf.org/rfc/rfc4648.txt.
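The base-32 pair is used in the same way as the base-64 pair. A minimal round-trip sketch, assuming the default character set for both calls and illustrative variable names:

```rascal
import String;

str original = "Hello, Rascal!";

// Encode, then decode again with the same (default) character set.
str encoded = toBase32(original);
str decoded = fromBase32(encoded);

// The round trip is expected to return the input unchanged.
assert decoded == original;
```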
function wrap
Word wrap a string to fit in a certain width.
str wrap(str src, int wrapLength)
Inserts newlines in a string in order to fit the string in a certain width. It only breaks on spaces (' ').
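A small sketch of wrapping a sentence to a narrow width. The sentence and the width of 15 are arbitrary choices, and the result is printed rather than asserted because the exact break positions are not specified here.

```rascal
import String;
import IO;

str sentence = "the quick brown fox jumps over the lazy dog";

// Ask wrap to fit the text in a width of 15; breaks happen only at spaces.
str wrapped = wrap(sentence, 15);

println(wrapped);
```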
function format
str format(str s, str dir, int n, str pad)
function rexpMatch
Determine if a string matches the given (Java-syntax) regular expression.
bool rexpMatch(str s, str re)
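The sketch below passes the regular expression as an ordinary string. The inputs are chosen so that the outcome does not depend on whether the whole string or only a part of it has to match; the variable names are illustrative.

```rascal
import String;

// The pattern is a plain string holding a Java regular expression.
bool lettersThenDigits = rexpMatch("abc123", "[a-z]+[0-9]+");  // true
bool anyDigits = rexpMatch("abc", "[0-9]+");                   // false
```

Note that characters such as the backslash and angle brackets must themselves be escaped inside a Rascal string literal, which is why this sketch avoids them.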
function toLocation
Convert a string value to a (source code) location.
loc toLocation(str s)
- Converts string s to a location.
- If the scheme is not provided, it is assumed to be cwd.
Examples
rascal>import String;
ok
rascal>toLocation("http://grammarware.net");
loc: |http://grammarware.net|
rascal>toLocation("document.xml");
loc: |cwd:///document.xml|
function substitute
Substitute substrings in a string based on a substitution map from location to string.
str substitute(str src, map[loc,str] s)
Examples
rascal>import String;
ok
rascal>substitute("abc", (|stdin:///|(1,1): "d"))
str: "adc"
───
adc
───
function indent
Indent a block of text.
str indent(str indentation, str content, bool indentFirstLine=false)
Every line in content will be indented using the characters
of indentation.
Benefits
- This operation executes in constant time, independent of the size of the content or the indentation.
- Indent is the identity function if
indentation == ""
Pitfalls
- This function works fine if indentation is not spaces or tabs; but it does not make much sense.
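A minimal sketch of indent, using the keyword parameter from the signature above. The sample block and the four-space indentation are arbitrary; the indented result is printed rather than asserted, and only the identity property stated under Benefits is checked.

```rascal
import String;
import IO;

str block = "if (x) {\n  y();\n}";

// Indent the block by four spaces, including the first line.
str shifted = indent("    ", block, indentFirstLine=true);
println(shifted);

// With an empty indentation the content comes back unchanged.
assert indent("", block) == block;
```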