A string is a sequence of characters enclosed within quotes. Like tuples, strings are immutable objects: we cannot modify a string or any part of it.
"Hello World!" is a string literal, and its data-type is 'str'.
A string must be enclosed within a pair of single quotes, double quotes or triple quotes.
Inside a triple-quoted string we can insert both apostrophes and double quotes without escaping them.
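As a small sketch of the three quoting styles (the sample sentence stored in `mixed` is made up for illustration):

```python
# Three equivalent ways to quote the same string literal.
single = 'Hello World!'
double = "Hello World!"
triple = '''Hello World!'''

print(single == double == triple)  # True
print(type(double))                # <class 'str'>

# Inside triple quotes, apostrophes and double quotes need no escaping.
mixed = '''It's called a "string" literal.'''
print(mixed)
```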
At the lowest level of a computer, information is stored as (binary) numbers. Every character, digit or symbol, printable or non-printable, has an integer value mapped to it. For example, the character 'A' is mapped to the integer 65, and the symbol '#' is mapped to 35.
Python ord() function accepts a single character and returns the integer value that represents that given character.
It is the reverse of the function chr().
Python chr() function accepts an integer and returns the character value that represents the given integer.
It is the reverse of the function ord().
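The round trip between ord() and chr() can be sketched as follows, using the two characters mentioned above (the variable names are only illustrative):

```python
# ord() maps a single character to its integer value.
code_a = ord("A")
code_hash = ord("#")
print(code_a, code_hash)  # 65 35

# chr() maps an integer back to its character.
print(chr(65), chr(35))  # A #

# The two functions undo each other.
print(chr(ord("A")) == "A")  # True
print(ord(chr(35)) == 35)    # True
```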
Python str() function accepts an object, converts it to a string and returns that string.
Python len() function can be used to find the length of a string.
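A minimal sketch of converting objects to strings and measuring a string's length; the values 3.14 and [1, 2] are arbitrary examples:

```python
# str() builds a string representation of any object.
number_text = str(3.14)
list_text = str([1, 2])
print(number_text, list_text)  # 3.14 [1, 2]
print(type(number_text))       # <class 'str'>

# len() counts the characters, including the space and the '!'.
greeting = "Hello World!"
print(len(greeting))  # 12
```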
String indexing works exactly like list indexing and tuple indexing, so you should go through them first.
Consider the string assigned to variable s:
s = "javascript"
In the above, s[0] points to the single character 'j', s[9] points to the single character 't'.
Every character of a string has an index depending on its position.
An index is an integer inside a pair of square brackets that is placed immediately after the string variable/object.
An index of zero refers to the first character of a string. The last character can be accessed with the index len(s) - 1.
A single character or value of a string can be accessed by the index of that character.
s[0] points to the 1st character 'j',
s[1] points to the 2nd character 'a',
...
s[9] points to the last character 't'.
Evaluating s[10] will raise an IndexError
because there is no character at index 10.
Each of s[0], s[1] ... s[9] evaluates to a one-character string of its own.
The length of s is 10,
since it contains 10 characters;
this can be checked with len(s).
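The indexing rules described above can be tried with a short sketch like this one (the helper variable `raised` is only there to record what happened):

```python
s = "javascript"

first = s[0]           # index 0 is the first character
last = s[len(s) - 1]   # index len(s) - 1 is the last character
print(first, last)     # j t
print(len(s))          # 10

# Index 10 does not exist, so Python raises IndexError.
raised = False
try:
    s[10]
except IndexError as exc:
    raised = True
    print("IndexError:", exc)
```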
Like lists and tuples, strings also support negative indexing. The first character can be accessed with the index -len(s). The last character of a string always has an index -1.
All of the following equality comparisons and identity comparisons return True:
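A few such comparisons, as a sketch. Only equality (==) is shown here, because whether two equal strings are also the same object (is) depends on interpreter caching.

```python
s = "javascript"

# The last character: positive and negative index agree.
print(s[-1] == s[len(s) - 1] == "t")  # True

# The first character: index 0 and index -len(s) agree.
print(s[-len(s)] == s[0] == "j")      # True

# Any position k can also be reached as k - len(s).
print(s[-2] == s[8] == "p")           # True
```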
Slicing strings works in the same way as tuple slicing does.
By slicing a string, a range of items can be accessed.
Consider the string s2 = "networking".
Here, len(s2), i.e., the number of characters, is 10.
Applying the slice [3:7] to this string, we get the output 'work'.
Python starts at index 3 (character 'w'), keeps including characters, and stops before index 7 (character 'i').
The character at index 7 is not included.
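The slice [3:7] discussed above can be checked with a short sketch:

```python
s2 = "networking"

print(len(s2))  # 10

# Start at index 3 ('w') and stop before index 7 ('i').
middle = s2[3:7]
print(middle)   # work

# The character at the stop index is not part of the result.
print(s2[7])    # i
```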
The slice notation [3:] translates to
"begin with character at index 3 and include rest of the string".
Like list slicing, the slice notation [::-1] returns a string with reversed order of characters.
With the slice notation [-1:-6:-2], the negative step makes Python count backwards. Counting starts from the last character, stops before index -6, and every alternate character is included.
Note: The slice notations [:] and [::] can copy a string, as is done in the case of lists.
Strings being immutable, no slice assignment is allowed with a string object.
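The slice notations covered in this part can be sketched together on the same string (the replacement text "NET" is an arbitrary value used to trigger the error):

```python
s2 = "networking"

rest = s2[3:]               # from index 3 to the end
reversed_text = s2[::-1]    # step -1 walks the string backwards
alternate = s2[-1:-6:-2]    # indices -1, -3, -5
whole = s2[:]               # the entire string

print(rest)           # working
print(reversed_text)  # gnikrowten
print(alternate)      # gir
print(whole == s2)    # True

# Strings are immutable, so slice assignment fails.
assignment_failed = False
try:
    s2[0:3] = "NET"
except TypeError as exc:
    assignment_failed = True
    print("TypeError:", exc)
```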
Similar to tuples, we can create a slice object and use it with strings.
Once a slice object, say last_four, is created, we can apply it to any string.
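A minimal sketch of such a slice object. The definition slice(-4, None) is an assumption based on the name last_four; it selects the last four characters.

```python
# Assumed definition: from index -4 up to the end of the string.
last_four = slice(-4, None)

# The same slice object works on any string.
print("networking"[last_four])  # king
print("javascript"[last_four])  # ript

# It is equivalent to writing the slice notation [-4:] directly.
print("networking"[last_four] == "networking"[-4:])  # True
```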
Refer to the slice() function to study how slice objects work.
Python membership operators in and not in can be used to check whether a character or a sequence of characters is present in a string or not.
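As a short sketch of the membership operators on a string (the search terms are arbitrary examples):

```python
s2 = "networking"

print("w" in s2)         # True: a single character
print("work" in s2)      # True: a sequence of characters
print("nets" not in s2)  # True: this sequence is absent

# The check is case-sensitive.
print("Work" in s2)      # False
```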
In Python, string literals can be added, or better said, joined with the + operator. This is called string concatenation.
Here, the + is also known as the concatenation operator.
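A minimal sketch of concatenation, with made-up sample values:

```python
first = "Hello"
second = "World!"

# The + operator joins strings end to end; nothing is inserted between them.
joined = first + " " + second
print(joined)  # Hello World!

# A number has to be converted with str() before it can be joined.
label = "Count: " + str(7)
print(label)   # Count: 7
```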
The * operator can also join string literals, in a different way: print("SoS-" * 4) will produce a string made of four repetitions of "SoS-".
Here, the * is also known as the repetition operator.
Note that 7 and "7" produce different outputs: with the integer 7 the * multiplies, whereas with the string "7" it repeats.
In the form string * integer, the integer can be negative, but then an empty string is returned.
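The behaviour of the repetition operator can be sketched like this (the multipliers 3 and -2 are arbitrary):

```python
repeated = "SoS-" * 4
print(repeated)  # SoS-SoS-SoS-SoS-

# An integer is multiplied, a string is repeated.
print(7 * 3)     # 21
print("7" * 3)   # 777

# A negative count gives an empty string.
empty = "SoS-" * -2
print(empty == "")  # True
```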
Suppose we want to display the string c:\user\public. Try this in the IDLE shell.
This happens because the backslash \ has a special meaning inside a Python string.
If the character next to the \ is a, b, f, n, N, r, t, u, U, v, x, 0, 1, 2, 3, 4, 5, 6, or 7,
then either we do not get the desired output, or we get an error.
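Both outcomes can be reproduced with a small sketch. To keep the example itself runnable, the faulty statement is built as text (the backslash comes from chr(92)) and handed to the built-in compile(). The path c:\new is a made-up example of the "undesired output" case.

```python
backslash = chr(92)  # the backslash character

# Source text of the statement: print("c:\user\public")
source = 'print("c:' + backslash + 'user' + backslash + 'public")'

# \u must be followed by four hex digits, so this literal is an error.
compiled = True
try:
    compile(source, "<demo>", "exec")
except SyntaxError as exc:
    compiled = False
    print("SyntaxError:", exc.msg)

# Here \n is silently read as a newline, so the output is not what we wanted.
path = "c:\new"
print(path)
print(len(path))  # 5
```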
In fact, if a backslash \ and the character sequence following it belong
to the following table, that sequence has a special meaning.
No. | Escape Sequence | Meaning |
---|---|---|
1 | \<newline> | Backslash and newline ignored |
2 | \\ | Backslash (\) |
3 | \' | Single quote (') |
4 | \" | Double quote (") |
5 | \a | ASCII Bell (BEL) |
6 | \b | ASCII Backspace (BS) |
7 | \f | ASCII Formfeed (FF) |
8 | \n | ASCII Linefeed (LF) |
9 | \r | ASCII Carriage Return (CR) |
10 | \t | ASCII Horizontal Tab (TAB) |
11 | \v | ASCII Vertical Tab (VT) |
12 | \ooo | Character with octal value ooo |
13 | \xhh | Character with hex value hh |
14 | \N{name} | Character named name in the Unicode database |
15 | \uxxxx | Character with 16-bit hex value xxxx |
16 | \Uxxxxxxxx | Character with 32-bit hex value xxxxxxxx |
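Several rows of the table can be verified with a short sketch; the letter A (integer value 65) is used as the sample character throughout:

```python
# Five different escape sequences that all produce the letter A.
print("\101" == "A")                        # octal 101 is 65
print("\x41" == "A")                        # hex 41 is 65
print("\u0041" == "A")                      # 16-bit hex value
print("\U00000041" == "A")                  # 32-bit hex value
print("\N{LATIN CAPITAL LETTER A}" == "A")  # Unicode name

# Each escape sequence stands for a single character.
print(len("\n"), len("\t"), len("\\"))      # 1 1 1
```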
In the world of computer programming, the \n
inside a string denotes a newline character,
\t denotes a tab character, and the list continues.
These are known as escape sequences,
and almost every programming language uses them.
For example, you cannot display the string "s\t" in a simple way. Try print("s\t") in the IDLE shell.
You end up with a single s as output.
Python actually prints a tab character
after the s; we just cannot see it.
If the character \ and the characters next to it
belong to the
above table,
then the character sequence
will not be printed, rather python prints something special,
like a tab or a newline etc.
If the character \ and the characters next to it
do not form an escape sequence,
then they are printed as they are.
For example, the character sequence \p is not
an escape sequence,
so print("s\pu")
will display exactly the string s\pu. Python 3.12 and later also emit a SyntaxWarning about the invalid escape sequence.
Try print("s\tu") in the IDLE shell.
You will see s and u separated by a tab as the output.
Here, the t is escaped with a backslash.
When t is escaped,
it loses its normal meaning (a literal t character)
and gains a special meaning (a tab character).
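The tab character is easier to observe with a small sketch. The built-in repr() is used here only as a way to make the invisible character visible.

```python
text = "s\tu"
print(text)

# The escape sequence \t counts as one character.
print(len(text))          # 3

# That character is the tab, whose integer value is 9.
print(text[1] == chr(9))  # True

# repr() shows the escape sequence instead of the invisible tab.
print(repr("s\t"))        # 's\t'
```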
The \n also belongs to the escape sequence group, and stands for a newline character.
The n is escaped with a backslash.
The \n splits the string into
two lines, adding a newline character in the middle.
The same effect can be achieved by using \n as a separator.
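Both ways of getting two lines of output can be sketched as follows. The sample text is made up, and reading "separator" as the sep argument of print() is an assumption.

```python
# A newline character embedded in the middle of one string.
text = "first line\nsecond line"
print(text)

# The \n occupies exactly one position in the string.
print(text[10] == "\n")  # True

# Assumed meaning of "separator": the sep argument of print().
print("first line", "second line", sep="\n")
```

Both print() calls display the same two lines.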
So how do we print the exact string s\t? We escape the backslash \ with another backslash \.
When we escape the \ with another backslash, the second \ loses its original meaning (the escaping function) and is printed literally.
In the same way, to display the exact string s\n, we write the backslash twice, as in s\\n.
If we want to display the string c:\user\public,
we can escape the backslash with another backslash.
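Escaping the backslash can be sketched like this, using the strings mentioned above:

```python
# Each \\ in the source becomes a single backslash in the string.
print("s\\t")  # s\t
print("s\\n")  # s\n

path = "c:\\user\\public"
print(path)    # c:\user\public

# The doubled backslash is stored as one character.
print(len("s\\t"))  # 3
```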
Take the example of the following string.
When we escape the " with a backslash, it loses its original meaning (delimiting a string) and is displayed literally.
The same string can also be written using only single quotes with the help of the escape mechanism.
When we escape the ' with a backslash, it loses its original meaning (delimiting a string) and is displayed literally.
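A sketch of both variants. The sentence is a made-up example, since the original string is not shown here.

```python
# Double-quoted literal: the inner double quotes are escaped.
double_quoted = "He said, \"It's done.\""

# Single-quoted literal: the apostrophe is escaped instead.
single_quoted = 'He said, "It\'s done."'

print(double_quoted)  # He said, "It's done."
print(single_quoted)  # He said, "It's done."

# Both literals describe the same string.
print(double_quoted == single_quoted)  # True
```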
Type the following statement in the IDLE shell, without terminating it with a double quote:
s = "This is a single \
and press the ENTER key:
You will see a '...' which means the interpreter is waiting
for end of input, because you did not terminate the string
with a double quote.
Now type long line" and press ENTER. The variable
s will contain the whole line without the
<newline> character; print(s) will show this.
You actually escaped the <newline> with a \, so the interpreter ignored it.
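The same experiment can be run from a script file instead of the shell, as in this sketch:

```python
# The backslash at the end of the first line escapes the newline,
# so the literal continues on the next line.
s = "This is a single \
long line"

print(s)          # This is a single long line
print("\n" in s)  # False
```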