Explore beam search – a heuristic search algorithm


The term “beam search” was coined in 1977 by Turing Award winner Dr. Raj Reddy of Carnegie Mellon University. Beam search is an optimization of best-first search that reduces its memory requirements. Best-first search is a graph search that ranks all partial solutions (states) according to a heuristic. Beam search, however, retains only a predetermined number of the best partial solutions as candidates. It is therefore a greedy algorithm.


Beam search builds its search tree using breadth-first search. At each level of the tree, it generates all the successors of the states at the current level and sorts them in ascending order of heuristic cost. However, it stores only a predetermined number of the best states at each level; this number is called the beamwidth. Only those states are expanded next. The wider the beam, the fewer states are pruned. When the beamwidth is infinite, no state is pruned and beam search is the same as breadth-first search.
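The level-by-level procedure described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `beam_search`, its parameters, the `visited` set used to avoid revisiting states, and the toy "+1 / *2" number puzzle are all assumptions made for the example.

```python
from typing import Callable, Hashable, Iterable, List, Optional


def beam_search(
    start: Hashable,
    successors: Callable[[Hashable], Iterable[Hashable]],
    heuristic: Callable[[Hashable], float],
    is_goal: Callable[[Hashable], bool],
    beam_width: int,
    max_depth: int = 50,
) -> Optional[List[Hashable]]:
    """Return a path from start to a goal, or None if none is found."""
    beam = [[start]]   # each entry is a path; its last element is the state
    visited = {start}  # assumption: states are never revisited
    for _ in range(max_depth + 1):
        # Stop as soon as a state on the current level is a goal.
        for path in beam:
            if is_goal(path[-1]):
                return path
        # Generate all successors of the states on the current level.
        candidates = []
        for path in beam:
            for nxt in successors(path[-1]):
                if nxt not in visited:
                    visited.add(nxt)
                    candidates.append(path + [nxt])
        if not candidates:
            return None
        # Sort in ascending order of heuristic cost, keep the best beam_width.
        candidates.sort(key=lambda p: heuristic(p[-1]))
        beam = candidates[:beam_width]
    return None


# Hypothetical toy problem: reach 10 from 1 using the moves "+1" and "*2".
TARGET = 10


def moves(n: int) -> List[int]:
    return [n + 1, n * 2]


def distance(n: int) -> int:
    return abs(TARGET - n)


narrow = beam_search(1, moves, distance, lambda n: n == TARGET, beam_width=1)
wide = beam_search(1, moves, distance, lambda n: n == TARGET, beam_width=100)
print(narrow)  # [1, 2, 4, 8, 9, 10]
print(wide)    # [1, 2, 4, 5, 10]
```

With a beamwidth of 1 the search greedily follows the state closest to the target and returns a longer path, while a beam wide enough to keep every state on each level behaves like breadth-first search and returns the shorter one.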

The beamwidth also bounds the amount of memory needed to perform the search. Because a goal state may be pruned, beam search sacrifices completeness (the guarantee that an algorithm will end with a solution if one exists). Beam search is also not optimal (i.e. there is no guarantee that it will find the best solution).
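To make the loss of completeness concrete, here is a small sketch on a hand-made graph. The graph, the heuristic values in `H`, and the function name `beam_search_graph` are invented for illustration; the graph is acyclic, so the loop is guaranteed to terminate.

```python
from typing import Dict, List, Optional

# Hypothetical graph: "A" looks better than "B" but leads to a dead end.
GRAPH: Dict[str, List[str]] = {
    "S": ["A", "B"],
    "A": ["C"],
    "B": ["G"],
    "C": [],
    "G": [],
}
# Hypothetical heuristic costs (lower is better).
H: Dict[str, int] = {"S": 3, "A": 1, "B": 2, "C": 1, "G": 0}


def beam_search_graph(start: str, goal: str, beam_width: int) -> Optional[List[str]]:
    """Beam search over GRAPH; returns a path or None if the beam empties."""
    beam = [[start]]
    while beam:
        for path in beam:
            if path[-1] == goal:
                return path
        # Expand the whole level, then keep only the cheapest beam_width paths.
        candidates = [path + [n] for path in beam for n in GRAPH[path[-1]]]
        candidates.sort(key=lambda p: H[p[-1]])
        beam = candidates[:beam_width]
    return None


print(beam_search_graph("S", "G", beam_width=1))  # None
print(beam_search_graph("S", "G", beam_width=2))  # ['S', 'B', 'G']
```

With a beamwidth of 1, the state "B" is pruned at the first level, so the only route to the goal is lost even though a solution exists.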

Beam search has three parts:

  • A problem to solve,
  • A set of heuristic rules to follow when pruning,
  • And a memory with limited capacity.

The problem is the task to be solved. It is usually represented as a graph and contains a set of nodes, one or more of which represent a goal. The heuristic rules are specific to the problem domain and prune unfavorable nodes from memory.

The “beam” is stored in memory. If the memory is full and a node needs to be added to the beam, the most expensive node is removed so that the memory limit is not exceeded.
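One way to sketch such a bounded memory is a fixed-capacity container backed by a heap. The class name `Beam` and its methods are hypothetical; the only library calls used are `heapq.heappush` and `heapq.heappushpop` from the Python standard library. Costs are stored negated so that the most expensive node sits at the top of the min-heap and can be evicted cheaply.

```python
import heapq
from typing import Any, List, Tuple


class Beam:
    """Fixed-capacity store that evicts the most expensive node when full."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # Entries are (-cost, insertion counter, node); the counter breaks ties
        # so that nodes themselves never need to be comparable.
        self._heap: List[Tuple[float, int, Any]] = []
        self._counter = 0

    def add(self, cost: float, node: Any) -> None:
        entry = (-cost, self._counter, node)
        self._counter += 1
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
        else:
            # Push the new entry, then drop whichever entry is most expensive
            # (this may be the new entry itself).
            heapq.heappushpop(self._heap, entry)

    def nodes(self) -> List[Any]:
        """Return the stored nodes, cheapest first."""
        return [node for _, _, node in sorted(self._heap, reverse=True)]


beam = Beam(capacity=3)
for cost, name in [(5, "a"), (2, "b"), (9, "c"), (1, "d"), (7, "e")]:
    beam.add(cost, name)
print(beam.nodes())  # ['d', 'b', 'a']
```

In this run, "c" (cost 9) is evicted when "d" arrives, and "e" (cost 7) is rejected immediately because it is more expensive than everything already in the beam.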


Beam search is most often used to maintain tractability in large systems where there is insufficient memory to store the entire search tree.

It is used in many machine translation systems, for example. (The current state of the art mainly uses methods based on neural machine translation.) To select the best translation, each part is processed, and many different ways of translating the words appear. The best translations according to their sentence structures are kept, while the others are discarded. The translator then evaluates the translations using a predefined criterion and selects the translation that best meets the objectives.

The Harpy speech recognition system, developed at CMU in 1976, was the first to use a beam search.
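The idea of keeping only the best partial translations at each step can be illustrated with a toy decoder. Everything here is made up for the example: the word candidates, their scores, the `PAIR_PENALTY` table that makes one word combination unlikely, and the function name `decode`. A real translation system would obtain its scores from a trained model.

```python
import math
from typing import Dict, List, Tuple

# Made-up candidate words and scores for each position (illustration only).
WORD_SCORES: List[Dict[str, float]] = [
    {"bank": 0.6, "shore": 0.4},
    {"line": 0.7, "side": 0.3},
]
# Made-up factor that makes the combination "bank line" unlikely.
PAIR_PENALTY: Dict[Tuple[str, str], float] = {("bank", "line"): 0.2}


def decode(beam_width: int) -> Tuple[List[str], float]:
    """Return the best word sequence found and its overall score."""
    beam: List[Tuple[float, List[str]]] = [(0.0, [])]  # (log score, words)
    for options in WORD_SCORES:
        candidates = []
        for log_score, words in beam:
            prev = words[-1] if words else ""
            for word, score in options.items():
                factor = PAIR_PENALTY.get((prev, word), 1.0)
                candidates.append(
                    (log_score + math.log(score * factor), words + [word])
                )
        # Keep the beam_width highest-scoring partial translations.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]
    best_log_score, best_words = beam[0]
    return best_words, math.exp(best_log_score)


print(decode(beam_width=1)[0])  # ['bank', 'side']
print(decode(beam_width=2)[0])  # ['shore', 'line']
```

With a beamwidth of 1 the decoder commits to "bank" after the first step and ends with an overall score of about 0.18, whereas a beamwidth of 2 keeps "shore" alive long enough to find the higher-scoring sequence (about 0.28).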


Combining beam search with depth-first search has led to beam stack search and depth-first beam search. Combining beam search with limited discrepancy search has led to beam search using limited discrepancy backtracking (BULB). Like beam search, the resulting algorithms quickly find reasonable but likely suboptimal solutions; they then backtrack and continue to search for better solutions until they converge on an optimal one.

Local beam search often ends at local maxima, so a standard remedy is to choose the next states at random, with a probability that depends on their heuristic evaluation. This type of search is called “stochastic beam search”. Flexible beam search and recovery beam search are two other variants.
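The random selection step of stochastic beam search might look like the sketch below. The function name `stochastic_select`, the `temperature` parameter, and the choice of weights proportional to exp(-cost / temperature) are assumptions made for this example; other weighting schemes are possible. States are drawn without replacement using `random.Random.choices` from the standard library.

```python
import math
import random
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def stochastic_select(
    candidates: Sequence[T],
    cost: Callable[[T], float],
    beam_width: int,
    rng: random.Random,
    temperature: float = 1.0,
) -> List[T]:
    """Sample up to beam_width distinct candidates, favoring low cost."""
    pool = list(candidates)
    # Assumed weighting: cheaper states get exponentially larger weights.
    weights = [math.exp(-cost(c) / temperature) for c in pool]
    chosen: List[T] = []
    while pool and len(chosen) < beam_width:
        if sum(weights) <= 0:
            # All weights underflowed to zero: fall back to a uniform draw.
            weights = [1.0] * len(pool)
        index = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        chosen.append(pool.pop(index))
        weights.pop(index)
    return chosen


rng = random.Random(0)
picked = stochastic_select([0, 1, 2, 3, 4], cost=float, beam_width=2, rng=rng)
print(picked)  # two distinct states, usually including low-cost ones
```

Unlike the deterministic "keep the best beamwidth states" rule, a poor-looking state still has a small chance of surviving, which gives the search a way to escape a local maximum.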
