What string method can be used to determine if the string contained in the variable text only consists of numbers?

Strings, which are widely used in Java programming, are a sequence of characters. In the Java programming language, strings are objects.

The Java platform provides the String class to create and manipulate strings.

Creating Strings

The most direct way to create a string is to write:

String greeting = "Hello world!";

In this case, "Hello world!" is a string literal—a series of characters in your code that is enclosed in double quotes. Whenever it encounters a string literal in your code, the compiler creates a String object with its value—in this case, Hello world!.

As with any other object, you can create String objects by using the new keyword and a constructor. The String class has thirteen constructors that allow you to provide the initial value of the string using different sources, such as an array of characters:

char[] helloArray = { 'h', 'e', 'l', 'l', 'o', '.' };
String helloString = new String(helloArray);
System.out.println(helloString);

The last line of this code snippet displays hello.

Note: The String class is immutable, so that once it is created a String object cannot be changed. The String class has a number of methods, some of which will be discussed below, that appear to modify strings. Since strings are immutable, what these methods really do is create and return a new string that contains the result of the operation.

Methods used to obtain information about an object are known as accessor methods. One accessor method that you can use with strings is the length() method, which returns the number of characters contained in the string object. After the following two lines of code have been executed, len equals 17:

String palindrome = "Dot saw I was Tod";
int len = palindrome.length();

A palindrome is a word or sentence that is symmetric—it is spelled the same forward and backward, ignoring case and punctuation. Here is a short and inefficient program to reverse a palindrome string. It invokes the String method charAt(i), which returns the ith character in the string, counting from 0.

public class StringDemo {
    public static void main(String[] args) {
        String palindrome = "Dot saw I was Tod";
        int len = palindrome.length();
        char[] tempCharArray = new char[len];
        char[] charArray = new char[len];

        // put original string in an
        // array of chars
        for (int i = 0; i < len; i++) {
            tempCharArray[i] = palindrome.charAt(i);
        }

        // reverse array of chars
        for (int j = 0; j < len; j++) {
            charArray[j] = tempCharArray[len - 1 - j];
        }

        String reversePalindrome = new String(charArray);
        System.out.println(reversePalindrome);
    }
}

Running the program produces this output:

doT saw I was toD

To accomplish the string reversal, the program had to convert the string to an array of characters (first for loop), reverse the array into a second array (second for loop), and then convert back to a string. The String class includes a method, getChars(), to convert a string, or a portion of a string, into an array of characters so we could replace the first for loop in the program above with

palindrome.getChars(0, len, tempCharArray, 0);

Concatenating Strings

The String class includes a method for concatenating two strings:

string1.concat(string2);

This returns a new string that is string1 with string2 added to it at the end.

You can also use the concat() method with string literals, as in:

"My name is ".concat("Rumplestiltskin");

Strings are more commonly concatenated with the + operator, as in

"Hello," + " world" + "!"

which results in

"Hello, world!"

The + operator is widely used in print statements. For example:

String string1 = "saw I was "; System.out.println("Dot " + string1 + "Tod");

which prints

Dot saw I was Tod

Such a concatenation can be a mixture of any objects. For each object that is not a String, its toString() method is called to convert it to a String.

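Note: The Java programming language does not permit an ordinary double-quoted string literal to span lines in source files, so you must use the + concatenation operator at the end of each line in a multi-line string. (Java 15 and later also provide text blocks, delimited by triple double quotes, for multi-line literals.) For example: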

String quote =
    "Now is the time for all good " +
    "men to come to the aid of their country.";

Breaking strings between lines using the + concatenation operator is, once again, very common in print statements.

Creating Format Strings

You have seen the use of the printf() and format() methods to print output with formatted numbers. The String class has an equivalent class method, format(), that returns a String object rather than a PrintStream object.

Using String's static format() method allows you to create a formatted string that you can reuse, as opposed to a one-time print statement. For example, instead of

System.out.printf("The value of the float " +
                  "variable is %f, while " +
                  "the value of the " +
                  "integer variable is %d, " +
                  "and the string is %s",
                  floatVar, intVar, stringVar);

you can write

String fs;
fs = String.format("The value of the float " +
                   "variable is %f, while " +
                   "the value of the " +
                   "integer variable is %d, " +
                   "and the string is %s",
                   floatVar, intVar, stringVar);
System.out.println(fs);

Since R2020b

Create a string array that contains addresses.

str = ["221B Baker St.","Tour Eiffel Champ de Mars","4059 Mt Lee Dr."]

str = 1x3 string
    "221B Baker St."    "Tour Eiffel Champ..."    "4059 Mt Lee Dr."

To find addresses that contain numbers, create a pattern that matches an arbitrary number of digits by using the digitsPattern function.

pat = digitsPattern

pat = pattern
  Matching:
    digitsPattern

Return a logical array indicating which strings contain digits. Display the matching strings.

TF = contains(str,pat)

TF = 1x3 logical array
   1   0   1

str(TF)

ans = 1x2 string
    "221B Baker St."    "4059 Mt Lee Dr."

Search for strings that have a sequence of digits followed by one letter. You can build more complex patterns by combining simple patterns.

pat = digitsPattern + lettersPattern(1)

pat = pattern
  Matching:
    digitsPattern + lettersPattern(1)

TF = contains(str,pat);
str(TF)

For a list of functions that create pattern objects, see pattern.

Credit: Jürgen Hermann, Horst Hansen

You need to check for the occurrence of any of a set of characters in a string.

The solution generalizes to any sequence (not just a string), and any set (any object in which membership can be tested with the in operator, not just one of characters):

def containsAny(str, set):
    """ Check whether sequence str contains ANY of the items in set. """
    return 1 in [c in str for c in set]

def containsAll(str, set):
    """ Check whether sequence str contains ALL of the items in set. """
    return 0 not in [c in str for c in set]

While the find and count string methods can check for substring occurrences, there is no ready-made function to check for the occurrence in a string of a set of characters.

While working on a condition to check whether a string contained the special characters used in the glob.glob standard library function, I came up with the above code (with help from the OpenProjects IRC channel #python). Written this way, it really is compatible with human thinking, even though you might not come up with such code intuitively. That is often the case with list comprehensions.

The following code creates a list of 1/0 values, one for each item in the set:

[c in str for c in set]

Then this code checks whether there is at least one true value in that list:

1 in [c in str for c in set]

Similarly, this checks that no false values are in the list:

0 not in [c in str for c in set]
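On interpreters that provide the built-in any and all functions (Python 2.5 and later, so newer than the versions this recipe targets), the same two checks can be sketched with generator expressions, which stop at the first decisive item instead of building the whole list. The names contains_any, contains_all, seq, and items are illustrative choices, not part of the recipe; seq and items stand in for str and set to avoid shadowing built-ins.

```python
def contains_any(seq, items):
    """Return True if seq contains at least one of the items."""
    # any() stops as soon as one membership test succeeds.
    return any(c in seq for c in items)


def contains_all(seq, items):
    """Return True if seq contains every one of the items."""
    # all() stops as soon as one membership test fails.
    return all(c in seq for c in items)


print(contains_any('*.py', '*?[]'))   # True
print(contains_all('134', '123'))     # False
```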

Usage examples are best cast in the form of unit tests to be appended to the .py source file of this module, with the usual idiom to ensure that the tests execute if the module runs as a main script:

if __name__ == "__main__":
    # unit tests, must print "OK!" when run
    assert containsAny('*.py', '*?[]')
    assert not containsAny('file.txt', '*?[]')
    assert containsAll('43221', '123')
    assert not containsAll('134', '123')
    print("OK!")

Of course, while the previous idioms are neat, there are alternatives (aren’t there always?). Here are the most elementary—and thus, in a sense, the most Pythonic—alternatives:

def containsAny(str, set):
    for c in set:
        if c in str:
            return 1
    return 0

def containsAll(str, set):
    for c in set:
        if c not in str:
            return 0
    return 1

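Here are some alternatives built on reduce and map. Note that, unlike the explicit loops, they do not return at the earliest opportunity: every membership test is evaluated before the results are combined. They are the most concise and thus, in a sense, the most powerful: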

from operator import and_, or_, contains

def containsAny(str, set):
    return reduce(or_, map(contains, len(set)*[str], set))

def containsAll(str, set):
    return reduce(and_, map(contains, len(set)*[str], set))

Here are some even slimmer variants of the latter that rely on a special method that string objects supply only in Python 2.2 and later:

from operator import and_, or_

def containsAny(str, set):
    return reduce(or_, map(str.__contains__, set))

def containsAll(str, set):
    return reduce(and_, map(str.__contains__, set))
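For readers on Python 3, where reduce is no longer a built-in, a rough equivalent of this variant might look like the sketch below. It imports reduce from functools and passes an explicit initial value so that an empty collection of items does not raise an error; both of those are adaptations assumed here, not part of the recipe, and the function and parameter names are illustrative.

```python
from functools import reduce
from operator import and_, or_


def contains_any(seq, items):
    # Starting from False means an empty `items` yields False.
    return reduce(or_, map(seq.__contains__, items), False)


def contains_all(seq, items):
    # Starting from True means an empty `items` yields True.
    return reduce(and_, map(seq.__contains__, items), True)


print(contains_any('*.py', '*?[]'))   # True
print(contains_all('134', '123'))     # False
```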

And here is a tricky variant that relies on functionality also available in 2.0:

def containsAll(str, set):
    try:
        map(str.index, set)
    except ValueError:
        return 0
    else:
        return 1

Fortunately, this rather tricky approach lacks an immediately obvious variant applicable to implement containsAny. However, one last tricky scheme, based on string.translate’s ability to delete all characters in a set, does apply to both functions:

import string
notrans = string.maketrans('', '')  # identity "translation"

def containsAny(str, set):
    return len(set) != len(set.translate(notrans, str))

def containsAll(str, set):
    return 0 == len(set.translate(notrans, str))

This trick at least has some depth—it relies on set.translate(notrans, str) being the subsequence of set that is made of characters not in str. If that subsequence has the same length as set, no characters have been removed by set.translate, so no characters of set are in str. Conversely, if that subsequence has length 0, all characters have been removed, so all characters of set are in str. The translate method of string objects keeps coming up naturally when one wants to treat strings as sets of characters, partly because it’s so speedy and partly because it’s so handy and flexible. See Recipe 3.8 for another similar application.
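The same deletion idea can be sketched for Python 3 strings, where string.maketrans no longer exists and str.translate takes a single table argument. The sketch assumes the three-argument form of str.maketrans, whose third argument lists the characters to delete; the names text and chars are illustrative replacements for str and set.

```python
def contains_any(text, chars):
    """True if text contains at least one character of chars."""
    # Delete from chars every character that occurs in text.
    # chars gets shorter only if the two strings share a character.
    remaining = chars.translate(str.maketrans('', '', text))
    return len(remaining) != len(chars)


def contains_all(text, chars):
    """True if text contains every character of chars."""
    # If nothing is left, every character of chars occurs in text.
    remaining = chars.translate(str.maketrans('', '', text))
    return len(remaining) == 0


print(contains_any('*.py', '*?[]'))    # True
print(contains_all('43221', '123'))    # True
```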

One last observation is that these different ways to approach the task have very different levels of generality. At one extreme, the earliest approaches, relying only on in (for looping on str and for membership in set) are the most general; they are not at all limited to string processing, and they make truly minimal demands on the representations of str and set. At the other extreme, the last approach, relying on the translate method, works only when both str and set are strings or closely mimic string objects’ functionality.
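To illustrate the general end of that spectrum, here is a small Python 3 sketch of the loop-based versions applied to containers that are not strings. The only requirement is that the first argument supports the in operator and the second can be iterated; the sample data and the snake_case names are made up for the illustration.

```python
def contains_any(seq, items):
    for c in items:
        if c in seq:
            return True
    return False


def contains_all(seq, items):
    for c in items:
        if c not in seq:
            return False
    return True


# A list searched for the members of a set.
print(contains_any([1, 2, 3], {3, 4}))           # True
# A range searched for the members of a tuple.
print(contains_all(range(10), (2, 4, 6)))        # True
# A dict (membership means "is a key") searched for a list of keys.
print(contains_all({'a': 1}, ['a', 'b']))        # False
```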



Credit: Jürgen Hermann, Nick Perkins

Given a set of characters to keep, you need to build a filtering functor (a function-like, callable object). The specific functor you need to build is one that, applied to any string s, returns a copy of s that contains only characters in the set.

The string.maketrans function and translate method of string objects are fast and handy for all tasks of this ilk:

import string

# Make a reusable string of all characters
_allchars = string.maketrans('', '')

def makefilter(keep):
    """ Return a functor that takes a string and returns a partial copy of that
        string consisting of only the characters in 'keep'.
    """
    # Make a string of all characters that are not in 'keep'
    delchars = _allchars.translate(_allchars, keep)

    # Return the functor, binding the two strings as default args
    return lambda s, a=_allchars, d=delchars: s.translate(a, d)

def canonicform(keep):
    """ Given a string, considered as a set of characters, return the
        string's characters as a canonic-form string: alphabetized
        and without duplicates.
    """
    return makefilter(keep)(_allchars)

if __name__ == '__main__':
    identifier = makefilter(string.letters + string.digits + '_')
    print identifier(_allchars)

The key to understanding this recipe lies in the definitions of the translate and maketrans functions in the string module. translate takes a string and replaces each character in it with the corresponding character in the translation table passed in as the second argument, deleting the characters specified in the third argument. maketrans is a utility routine that helps create the translation tables.

Efficiency is vastly improved by splitting the filtering task into preparation and execution phases. The string of all characters is clearly reusable, so we build it once and for all when this module is imported. That way, we ensure that each filtering functor has a reference to the same string of all characters, not wasting any memory. The string of characters to delete depends on the set of characters to keep, so we build it in the makefilter factory function. This is done quite rapidly using the translate method to delete the characters to keep from the string of all characters. The translate method is very fast, as are the construction and execution of these useful little functors. The solution also supplies an extremely simple function to put any set of characters, originally an arbitrary string, into canonic-string form (alphabetically sorted, without duplicates). The same trick encapsulated in the canonicform function is also explicitly used in the test code that is executed when this runs as a script.
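The preparation and execution split described above can be sketched in Python 3 as well, with the caveat that the closest analogue works on bytes rather than str, because a table of "all characters" is only practical for the 256 byte values. The sketch assumes bytes.translate(None, delete), which deletes without remapping; the name _ALL_BYTES and the sample inputs are illustrative.

```python
# All 256 possible byte values, built once when the module is imported.
_ALL_BYTES = bytes(range(256))


def makefilter(keep):
    """Return a function that keeps only the bytes found in `keep`."""
    # Preparation phase: compute the bytes to delete by removing
    # the bytes we want to keep from the full set.
    delchars = _ALL_BYTES.translate(None, keep)
    # Execution phase: each call is a single translate.
    return lambda s, d=delchars: s.translate(None, d)


def canonicform(keep):
    """Return the bytes of `keep`, sorted and without duplicates."""
    return makefilter(keep)(_ALL_BYTES)


digits_only = makefilter(b'0123456789')
print(digits_only(b'tel: 555-0199'))   # b'5550199'
print(canonicform(b'cabbage'))         # b'abceg'
```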

Of course, you don’t have to use lambda (here or anywhere else). A named function local to the factory function will do just as well. In other words, this recipe works fine if you change makefilter’s return statement into the following two statements:

    def filter(s, a=_allchars, d=delchars):
        return s.translate(a, d)
    return filter

Many Pythonistas would consider this clearer and more readable.

This isn’t a big issue, but remember that lambda is never necessary. In any case in which you find yourself straining to fit code into a lambda’s limitations (i.e., just an expression, with no statements allowed), you can and should always use a local named function instead, to avoid all the limitations and problems.

With Python 2.2, or Python 2.1 and a from __future__ import nested_scopes, you get lexically nested scopes, so that if you want to, you can avoid binding _allchars and delchars as default values for arguments in the returned functor. However, it is (marginally) faster to use this binding anyway: local variables are the fastest kind to access, and arguments are nothing but prebound local variables. Globals and names from nested scopes require a little more effort from the interpreter (and sometimes, perhaps more significantly, from a human being who is reading the code). This is why we bind _allchars as argument a here despite the fact that, in any release of Python, we could have just accessed it as a global variable.
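The two binding styles discussed here can be compared side by side. The following Python 3 sketch works on bytes (an assumption made so that bytes.translate(None, delete) can stand in for the recipe's two-argument translate); makefilter_defaults and makefilter_nested are hypothetical names for the illustration. Both factories return filters with the same behavior and differ only in how the inner function reaches delchars.

```python
_ALL_BYTES = bytes(range(256))


def makefilter_defaults(keep):
    delchars = _ALL_BYTES.translate(None, keep)
    # delchars is bound once as a default argument, so inside the
    # lambda it is an ordinary local name (d).
    return lambda s, d=delchars: s.translate(None, d)


def makefilter_nested(keep):
    delchars = _ALL_BYTES.translate(None, keep)

    def keep_only(s):
        # delchars is found in the enclosing function's scope.
        return s.translate(None, delchars)

    return keep_only


vowels_a = makefilter_defaults(b'aeiou')
vowels_b = makefilter_nested(b'aeiou')
print(vowels_a(b'nested scopes'))   # b'eeoe'
print(vowels_b(b'nested scopes'))   # b'eeoe'
```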

See also: the documentation for the maketrans function in the string module in the Library Reference.
