
Strings in Python

Textual data in Python is handled with str objects.
Python 2 and Python 3 handle text differently by default, so it is important
to understand the difference.
In Python 2, by default, a ‘str’ object is stored as a sequence of bytes.
In Python 3, by default, a ‘str’ object is stored as a sequence of Unicode code points.

# Python 2.7
message_py2 = "I like Python"
print(type(message_py2))
<type 'str'>

# Python 3.6
message_py3 = "I like Python"
print(type(message_py3))
<class 'str'>
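One way to see what "code points" means in Python 3 is to compare the length of a str with the length of its encoded bytes. The following is a minimal sketch, assuming Python 3 and the UTF-8 encoding; the variable names are only illustrative.

```python
# Python 3: a str is a sequence of Unicode code points
text = "caf\u00e9"               # the word 'café'
encoded = text.encode("utf-8")   # a bytes object

num_code_points = len(text)      # counts code points
num_bytes = len(encoded)         # counts bytes; 'é' needs two bytes in UTF-8

print(num_code_points, num_bytes)
```

The two lengths differ because a single code point may need more than one byte once it is encoded.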

Python 2: If we use the “u” prefix, we get a “unicode” object.
Byte strings and unicode strings each have a method that converts them to the other type.
Unicode strings have a .encode() method that produces bytes, and
byte strings have a .decode() method that produces unicode.
Each method takes an argument, which is the name of the encoding to use for the operation.

# Python 2
my_unicode = u"Hi \u2119\u01B4\u2602\u210c\u00F8\u1F24"
print(type(my_unicode))
<type 'unicode'>

print(my_unicode)
Hi ℙƴ☂ℌøἤ

# to convert unicode to bytes using .encode()
my_utf8 = my_unicode.encode('utf-8')
print(type(my_utf8))
<type 'str'>

# to convert bytes to unicode using .decode()
print(type(my_utf8.decode('utf-8')))
<type 'unicode'>

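Python 3: If we use the “b” prefix, we get a “bytes” object. A bytes literal may contain only ASCII characters, and \u escapes are not interpreted inside it.

# Python 3
my_unicode = "Hi \u2119\u01B4\u2602\u210c\u00F8\u1F24"
my_bytes = b"Hi Python"  # ASCII only; use my_unicode.encode("utf-8") for other characters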
print(type(my_bytes))
<class 'bytes'>

# To convert bytes to unicode
print(type(my_bytes.decode("utf-8")))
<class 'str'>
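The encoding name passed to .encode() and .decode() has to match for the round trip to work. The sketch below assumes Python 3; the names original, utf8_bytes, utf16_bytes and decode_failed are illustrative and not part of any API.

```python
# Python 3 sketch: encode and decode with matching encoding names
original = "Hi \u2119\u01B4\u2602\u210c\u00F8\u1F24"

utf8_bytes = original.encode("utf-8")
utf16_bytes = original.encode("utf-16")

# Decoding with the same encoding restores the original text
round_trip_utf8 = utf8_bytes.decode("utf-8")
round_trip_utf16 = utf16_bytes.decode("utf-16")

# Decoding with a mismatched encoding can fail: these bytes are not valid ASCII
try:
    utf8_bytes.decode("ascii")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

print(round_trip_utf8 == original, round_trip_utf16 == original, decode_failed)
```

The same text produces different bytes under different encodings, which is why the encoding used to write data should be recorded alongside it.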

Commonly used methods with strings

str.capitalize()
Return a copy of the string with its first character capitalized and the rest lowercased.

[name.capitalize() for name in ("amar", "akbar", "anthony")]

Output:
['Amar', 'Akbar', 'Anthony']

str.center(width[, fillchar])
Return the string centered in a string of length width. Padding is done using the specified fillchar (default is a space).

"Happy B'day".center(20, '!')
"!!!!Happy B'day!!!!!"

str.count(sub[, start[, end]])
Return the number of non-overlapping occurrences of substring sub in the range [start, end].

"I scream, you scream, we all scream, for ice cream!".count("cream")
4
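The optional start and end arguments of count() behave like slice indices, and matches are counted without overlap. A small sketch under those assumptions, with illustrative variable names:

```python
phrase = "I scream, you scream, we all scream, for ice cream!"

total = phrase.count("cream")

# Only count occurrences from the word "we" onwards
start = phrase.index("we")
from_we = phrase.count("cream", start)

# Overlapping matches are not counted: "aaaa" holds two separate "aa", not three
overlap = "aaaa".count("aa")

print(total, from_we, overlap)
```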

str.endswith(suffix[, start[, end]])
Return True if the string ends with the specified suffix, otherwise return False.
suffix can also be a tuple of suffixes to look for.

suffix = ('ish', 'less')
words = ['childish', 'ageless','fanciful', 'lawless']

[word for word in words if word.endswith(suffix)]
['childish', 'ageless', 'lawless']

str.find(sub[, start[, end]])
Return the lowest index in the string where substring sub is found. Return -1 if sub is not found.

"I thought a thought".find("thought")
2
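Because find() signals a miss with -1 rather than an exception, the result is usually checked before it is used as an index. The sketch below also uses the start argument to locate a second occurrence; the variable names are illustrative.

```python
sentence = "I thought a thought"

first = sentence.find("thought")              # lowest index of the match
second = sentence.find("thought", first + 1)  # search again, after the first match
missing = sentence.find("idea")               # substring is absent

if missing == -1:
    print("'idea' was not found")
print(first, second)
```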

str.join(iterable)
Return a string which is the concatenation of the strings in iterable, with the string on which the method is called used as the separator.

','.join(['Red','Blue','Green'])
'Red,Blue,Green'
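join() expects every element of the iterable to be a string. A minimal sketch of what happens otherwise, and one common way around it (the list of numbers is made up for illustration):

```python
numbers = [1, 2, 3]

# Joining non-string items raises TypeError
try:
    ",".join(numbers)
    raised = False
except TypeError:
    raised = True

# Convert each item to str first
joined = ",".join(str(n) for n in numbers)

print(raised, joined)
```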

str.ljust(width[, fillchar])
Return the string left-justified in a string of length width. Padding is done using the specified fillchar (default is a space).

for word in ['One self', 'two pair', 'three crowd']:
    print(word.ljust(15,'-'))
One self-------
two pair-------
three crowd----

str.lower()
Return a copy of the string converted to lowercase.

email_id = ['Abc@Gmail.com', 'kbc@YAHOO.COM']
for email in email_id:
    print(email.lower())
abc@gmail.com
kbc@yahoo.com

str.split(sep=None, maxsplit=-1)
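Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done. If sep is not specified or is None, runs of whitespace act as a single separator.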

'Red,Blue,Green,Yellow'.split(',')
['Red', 'Blue', 'Green', 'Yellow']
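The sep and maxsplit arguments change the result in ways that are easy to overlook. The following sketch, with made-up input strings, compares a few cases:

```python
# No sep: runs of whitespace count as one separator
by_whitespace = "a  b   c".split()

# Explicit sep: consecutive separators produce empty strings
with_empty = "a,b,,c".split(",")

# maxsplit=1: split only once, keep the rest intact
first_and_rest = "Red,Blue,Green,Yellow".split(",", 1)

print(by_whitespace)
print(with_empty)
print(first_and_rest)
```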

str.strip([chars])
Return a copy of the string with the leading and trailing characters removed. The chars argument specifies the set of characters to remove; if it is omitted, whitespace is removed.

'   i am taking some extra space, squeeze me   '.strip()
'i am taking some extra space, squeeze me'
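When an argument is passed to strip(), it is treated as a set of characters rather than as a prefix or suffix. A short sketch with illustrative inputs, including the one-sided variants lstrip() and rstrip():

```python
# Every leading/trailing character found in ".!" is removed
cleaned = "...hello!!!".strip(".!")

# lstrip and rstrip work on one side only
left_only = "xxhixx".lstrip("x")
right_only = "xxhixx".rstrip("x")

# Characters in the middle are never touched
inner_kept = "  a b  ".strip()

print(cleaned, left_only, right_only, repr(inner_kept))
```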

str.upper()
Return a copy of the string with all the cased characters converted to uppercase.

loud_words = ['scream', 'yell', 'shout']

for word in loud_words:
    print(word.upper())
SCREAM
YELL
SHOUT

Shekhar Pandey
