Fast circular dictionary-matching algorithm

  • TANVER ATHAR (a1), CARL BARTON (a1), WIDMER BLAND (a2), JIA GAO (a1), COSTAS S. ILIOPOULOS (a1), CHANG LIU (a1) and SOLON P. PISSIS (a1)...

Abstract

Circular string matching is a problem that arises naturally in many contexts. It consists of finding all occurrences of the rotations of a pattern of length $m$ in a text of length $n$. There exist optimal worst- and average-case algorithms for circular string matching. Here, we present a suboptimal average-case algorithm for circular string matching requiring time $\mathcal{O}(n)$ and space $\mathcal{O}(m)$. The importance of our contribution is underlined by the fact that the proposed algorithm can be easily adapted to deal with circular dictionary matching. In particular, we show how the circular dictionary-matching problem can be solved in average-case time $\mathcal{O}(n + M)$ and space $\mathcal{O}(M)$, where $M$ is the total length of the dictionary patterns, assuming that the shortest pattern is sufficiently long. Moreover, the presented average-case algorithms and other worst-case approaches were also implemented. Experimental results, using real and synthetic data, demonstrate that the implementations of the presented algorithms can accelerate the computations by more than a factor of two compared with the corresponding implementations of other approaches.
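
To make the problem statement concrete, the following minimal Python sketch reports every text position at which some rotation of a pattern occurs, and extends this to a dictionary of patterns. It is a naive baseline running in roughly $\mathcal{O}(nm)$ time per pattern, written only to illustrate the definitions; it is not the average-case algorithm presented in this article. The function names `circular_occurrences` and `circular_dictionary_occurrences` are hypothetical and chosen for illustration.

```python
def circular_occurrences(pattern, text):
    """Return all start positions in text where a rotation of pattern occurs.

    Naive baseline for illustration only: it materialises every rotation
    of the pattern and tests each length-m window of the text against them.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # All m rotations of the pattern (duplicates collapse in the set).
    rotations = {pattern[i:] + pattern[:i] for i in range(m)}
    # A window matches if it equals any rotation.
    return [i for i in range(n - m + 1) if text[i:i + m] in rotations]


def circular_dictionary_occurrences(patterns, text):
    """Circular dictionary matching: run the baseline once per pattern."""
    return {p: circular_occurrences(p, text) for p in patterns}


if __name__ == "__main__":
    # "bca" (a rotation of "abc") starts at position 1, "abc" at position 4.
    print(circular_occurrences("abc", "xbcaabcx"))
    print(circular_dictionary_occurrences(["ab", "cx"], "xbcaabcx"))
```

Running the baseline once per pattern costs time proportional to the text length multiplied by the total pattern length, which is the kind of overhead that the $\mathcal{O}(n + M)$ average-case bound stated above avoids.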


Copyright

This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.

