## 9.2 Input Size Revisited
So far it has usually sufficed to call *n* the input size in our algorithms because *n* has been a reasonable measure of the amount of data in the input. For example, in the case of sorting algorithms, *n*, the number of keys to be sorted, is a good measure of the amount of data in the input. So we called *n* the input size. However, we must not inadvertently assume that *n* is the input size in every algorithm. Consider the following algorithm, which determines whether a positive integer *n* is prime.
```
bool prime (int n)
{
    int i;
    bool isPrime;                  // "switch" is a reserved word in C++,
                                   // so we call the flag isPrime
    isPrime = true;
    i = 2;
    while (isPrime && i * i <= n) {   // i * i <= n means i ≤ n^(1/2)
        if (n % i == 0)            // % returns the remainder
            isPrime = false;       // when n is divided by i.
        else
            i++;
    }
    return isPrime;
}
```
The number of passes through the **while** loop in this prime-checking algorithm is clearly in Θ(*n*<sup>1/2</sup>). However, is it a polynomial-time algorithm? The parameter *n* is the input to the algorithm; it is not the size of the input. That is, each value of *n* constitutes an instance of the problem. This is unlike a sorting algorithm, for example, in which *n* is the number of keys and the instance is the *n* keys. If the value of *n* is the input and not the size of the input in function *prime*, what is the size of the input? We return to this question after we define input size more concretely than we did in [Section 1.3](LiB0010.html#57).
Definition For a given algorithm, the ***input size*** is defined as the number of characters it takes to write the input.
This definition is not different from that given in [Section 1.3](LiB0010.html#57). It is only more specific about how we measure the size of the input. To count the characters it takes to write the input, we need to know how the input is encoded. Suppose that we encode it in binary, which is used inside computers. Then the characters used for the encoding are binary digits (bits), and the number of characters it takes to encode a positive integer *x* is ⌊lg *x*⌋ + 1. For example, 31 = 11111₂ and ⌊lg 31⌋ + 1 = 5. We simply say that it takes about lg *x* bits to encode a positive integer *x* in binary. Suppose that we use binary encoding and we wish to determine the input size for an algorithm that sorts *n* positive integers. The integers to be sorted are the inputs to the algorithm. Therefore, the input size is the count of the number of bits it takes to encode them. If the largest integer is *L*, and we encode each integer in the number of bits needed to encode the largest, then it takes about lg *L* bits to encode each of them. The input size for the *n* integers is therefore about *n* lg *L.* Suppose that instead we choose to encode the integers in base 10. Then the characters used for the encoding are decimal digits, it takes about log *L* characters to encode the largest, and the input size for the *n* integers is about *n* log *L.* Because
*n* lg *L* = (lg 10)(*n* log *L*),
an algorithm is polynomial-time in terms of one of these input sizes if and only if it is polynomial-time in terms of the other.
If we restrict ourselves to "reasonable" encoding schemes, then the particular encoding scheme used does not affect the determination of whether an algorithm is polynomial-time. There does not seem to be a satisfactory formal definition of "reasonable." However, for most algorithms we usually agree on what is reasonable. For example, for any algorithm in this text, any base other than 1 could be used to encode an integer without affecting whether or not the algorithm is polynomial-time. We would therefore consider any such encoding system to be reasonable. Encoding in base 1, which is called the unary form, would not be considered reasonable.
In the preceding chapters, we simply called *n*, the number of keys to be sorted, the input size in our sorting algorithms. Using *n* as the input size, we showed that the algorithms are polynomial-time. Do they remain polynomial-time when we are precise about the input size? Next we illustrate that they do. When we are being precise about input size, we also need to be more precise (than we were in [Section 1.3.1](LiB0010.html#58)) about the definition of worst-case time complexity. The precise definition follows.
Definition For a given algorithm, *W*(*s*) is defined as the maximum number of steps done by the algorithm for an input size of *s.* *W*(*s*) is called the ***worst-case time complexity*** of the algorithm.
A ***step*** can be considered the equivalent of one machine comparison or assignment, or, to keep the analysis machine-independent, one bit comparison or assignment. This definition is not different from that given in [Section 1.3](LiB0010.html#57). It is only more specific about the basic operation. That is, according to this definition, each step constitutes one execution of the basic operation. We used *s* instead of *n* for the input size because (1) the parameter *n* to our algorithms is not always a measure of the input size (e.g., in the prime-checking algorithm presented at the start of this section) and (2) when *n* is a measure of the input size, it is not ordinarily a precise measure of it. According to the definition just given, we must count all the steps done by the algorithm. Let's illustrate how we can do this, while still avoiding the details of the implementation, by analyzing [Algorithm 1.3](LiB0008.html#32) (Exchange Sort). For simplicity, assume that the keys are positive integers and that there are no other fields in the records. Look again at [Algorithm 1.3](LiB0008.html#32). The number of steps done to increment loops and do branching is bounded by a constant *c* times *n*². If the integers are sufficiently large, they cannot be compared or assigned in one step by the computer. We saw a similar situation when we discussed large integer arithmetic in [Section 2.6](LiB0020.html#207). Therefore, we should not consider one key comparison or assignment as one step. To keep our analysis machine-independent, we consider one step to be either one bit comparison or one bit assignment. Therefore, if *L* is the largest integer, it takes at most lg *L* steps to compare one integer or to assign one integer.
When we analyzed [Algorithm 1.3](LiB0008.html#32) in [Sections 1.3](LiB0010.html#57) and [7.2](LiB0053.html#633), we saw that in the worst case Exchange Sort does *n*(*n* − 1)/2 comparisons of keys and 3*n*(*n* − 1)/2 assignments to sort *n* positive integers. Therefore, the maximum number of steps done by Exchange Sort is no greater than
[*n*(*n* − 1)/2] lg *L* + [3*n*(*n* − 1)/2] lg *L* + *cn*² = 2*n*(*n* − 1) lg *L* + *cn*².
Let's use *s* = *n* lg *L* as the input size. Then
2*n*(*n* − 1) lg *L* + *cn*² ≤ 2*n*(*n* lg *L*) + *cn*² ≤ 2*s* · *s* + *cs*² = (2 + *c*)*s*².

(The middle step uses *n* ≤ *n* lg *L* = *s*, which holds because lg *L* ≥ 1.)
We have shown that Exchange Sort remains polynomial-time when we are precise about the input size. We can obtain similar results for all the algorithms that we've shown to be polynomial-time using imprecise input sizes. Furthermore, we can show that the algorithms that we've shown to be nonpolynomial-time (e.g., Algorithm 3.11) remain nonpolynomial-time when we are precise about the input size. We see that when *n* is a measure of the amount of data in the input, we obtain correct results concerning whether an algorithm is polynomial-time by simply using *n* as the input size. Therefore, when this is the case, we continue to use *n* as the input size.
We return now to the prime-checking algorithm. Because the input to the algorithm is the value of *n*, the input size is the number of characters it takes to encode *n.* Recall that if we are using base 10 encoding, it takes ⌊log *n*⌋ + 1 characters to encode *n.* For example, if the number is 340, it takes 3 decimal digits, not 340, to encode it. In general, if we are using base 10 encoding, and if we set
*s* = log *n*,
then *s* is approximately the size of the input. In the worst case, there are ⌊*n*<sup>1/2</sup>⌋ − 1 passes through the loop in function *prime.* Because *n* = 10<sup>*s*</sup>, the worst-case number of passes through the loop is about 10<sup>*s*/2</sup>. Because the total number of steps is at least equal to the number of passes through the loop, the time complexity is nonpolynomial. If we use binary encoding, it takes about lg *n* characters to encode *n.* Therefore, if we use binary encoding,
*r* = lg *n*
is about equal to the input size, and the number of passes through the loop is about equal to 2<sup>*r*/2</sup>. The time complexity is still nonpolynomial. The result remains unchanged as long as we use a "reasonable" encoding scheme. We mentioned previously that we would not consider unary encoding "reasonable." If we used that encoding scheme, it would take *n* characters to encode the number *n*. For example, the number 7 would be encoded as 1111111. Using this encoding, the prime-checking algorithm has a polynomial time complexity. So we see that our results do change if we stray from a "reasonable" encoding system.
In algorithms such as the prime-checking algorithm, we call *n* a ***magnitude*** in the input. We've seen other algorithms whose time complexities are polynomials in terms of magnitude(s) but are not polynomials in terms of size. The time complexity of [Algorithm 1.7](LiB0009.html#51) for computing the *n*th Fibonacci term is in Θ(*n*). Because *n* is a magnitude in the input and lg *n* measures the size of the input, the time complexity of [Algorithm 1.7](LiB0009.html#51) is linear in terms of magnitude but exponential in terms of size. The time complexity of [Algorithm 3.2](LiB0025.html#263) for computing the binomial coefficient is in Θ(*n*²). Because *n* is a magnitude in the input and lg *n* measures the size of the input, the time complexity of [Algorithm 3.2](LiB0025.html#263) is quadratic in terms of magnitude but exponential in terms of size. The time complexity of the dynamic programming algorithm for the 0–1 Knapsack problem discussed in Section 4.4.4 is in Θ(*nW*). In this algorithm, *n* is a measure of size because it is the number of items in the input. However, *W* is a magnitude because it is the maximum capacity of the knapsack; lg *W* measures the size of *W*. This algorithm's time complexity is polynomial in terms of magnitudes and size, but it is exponential in terms of size alone.
An algorithm whose worst-case time complexity is bounded above by a polynomial function of its size and magnitudes is called ***pseudopolynomial-time***. Such an algorithm can often be quite useful because it is inefficient only when confronted with instances containing extremely large numbers. Such instances might not pertain to the application of interest. For example, in the 0–1 Knapsack problem, we might often be interested in cases where *W* is not extremely large.