Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

A. Complexity of Functions and Order Analysis

Bibliography

[bib1] [ABJ82] S. G. Akl, D. T. Bernard, and R. J. Jordan. Design and implementation of a parallel tree search algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-4:192–203, 1982.

[bib2] [ACM91] ACM. Resources in Parallel and Concurrent Systems. ACM Press, New York, NY, <year>1991</year>.

[bib3] [ACS89a] A. Aggarwal, A. K. Chandra, and M. Snir. A model for hierarchical memory. Technical Report RC 15118 (No. 67337), IBM T. J. Watson Research Center, Yorktown Heights, NY, 1989.

[bib4] [ACS89b] A. Aggarwal, A. K. Chandra, and M. Snir. On communication latency in PRAM computations. Technical Report RC 14973 (No. 66882), IBM T. J. Watson Research Center, Yorktown Heights, NY, 1989.

[bib5] [ACS89c] A. Aggarwal, A. K. Chandra, and M. Snir. Communication complexity of PRAMs. Technical Report RC 14998 (64644), IBM T. J. Watson Research Center, Yorktown Heights, NY, 1989.

[bib6] [ADJ⁺91] A. Agarwal, G. D’Souza, K. Johnson, D. Kranz, J. Kubiatowicz, K. Kurihara, B. -H. Lim, G. Maa, D. Nussbaum, M. Parkin, and D. Yeung. The MIT alewife machine : A large-scale distributed-memory multiprocessor. In Proceedings of Workshop on Scalable Shared Memory Multiprocessors. Kluwer Academic, 1991.

[bib7] [AFKW90] I. Angus, G. C. Fox, J. Kim, and D. W. Walker. Solving Problems on Concurrent Processors: Software for Concurrent Processors: Volume II. Prentice-Hall, Englewood Cliffs, NJ, <year>1990</year>.

[bib8] [AG94] G. S. Almasi and A. Gottlieb. Highly Parallel Computing. Benjamin/Cummings, Redwood City, CA, <year>1994</year>. (Second Edition).

[bib9] [Aga89] A. Agarwal. Performance tradeoffs in multithreaded processors. Technical Report 89-566, Massachusetts Institute of Technology, Microsystems Program Office, Cambridge, MA, 1989.

[bib10] [Aga91] A. Agarwal. Performance tradeoffs in multithreaded processors. Technical report MIT/LCS/TR 501; VLSI memo no. 89-566, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, MA, 1991.

[bib11] [AGGM90] A. Averbuch, E. Gabber, B. Gordissky, and Y. Medan. A parallel FFT on an MIMD machine. Parallel Computing, 15:61–74, 1990.

[bib12] [Agh86] G. Agha. Actors: A Model of Concurrent Computation in Distributed Systems. MIT Press, Cambridge, MA, <year>1986</year>.

[bib13] [AHMP87] H. Alt, T. Hagerup, K. Mehlhorn, and F. P. Preparata. Deterministic simulation of idealized parallel computers on more realistic ones. SIAM Journal of Computing, 16(5):808–835, October 1987.

[bib14] [AHU74] A. V. Aho, J. E. Hopcroft, and J. D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading, MA, <year>1974</year>.

[bib15] [AJM88] D. P. Agrawal, V. K. Janakiram, and R. Mehrotra. A randomized parallel branch-and-bound algorithm. In Proceedings of the 1988 International Conference on Parallel Processing, 1988.

[bib16] [AK84] M. J. Atallah and S. R. Kosaraju. Graph problems on a mesh-connected processor array. Journal of ACM, 31(3):649–667, July 1984.

[bib17] [Akl85] S. G. Akl. Parallel Sorting Algorithms. Academic Press, San Diego, CA, <year>1985</year>.

[bib18] [Akl89] S. G. Akl. The Design and Analysis of Parallel Algorithms. Prentice-Hall, Englewood Cliffs, NJ, <year>1989</year>.

[bib19] [Akl97] S. G. Akl. Parallel Computation Models and Methods. Prentice-Hall, Englewood Cliffs, NJ, <year>1997</year>.

[bib20] [AKR89] S. Arvindam, V. Kumar, and V. N. Rao. Floorplan optimization on multiprocessors. In Proceedings of the 1989 International Conference on Computer Design, 1989. Also published as Technical Report ACT-OODS-241-89, Microelectronics and Computer Corporation, Austin, TX.

[bib21] [AKR90] S. Arvindam, V. Kumar, and V. N. Rao. Efficient parallel algorithms for search problems: Applications in VLSI CAD. In Proceedings of the Third Symposium on the Frontiers of Massively Parallel Computation, 1990.

[bib22] [AKRS91] S. Arvindam, V. Kumar, V. N. Rao, and V. Singh. Automatic test pattern generation on multiprocessors. Parallel Computing, 17, number 12:1323–1342, December 1991.

[bib23] [AKS83] M. Ajtai, J. Komlos, and E. Szemeredi. An O (n log n) sorting network. In Proceedings of the 15th Annual ACM Symposium on Theory of Computing, 1–9, 1983.

[bib24] [AL93] S. G. Akl and K. A. Lyons. Parallel Computational Geometry. Prentice-Hall, Englewood Cliffs, NJ, <year>1993</year>.

[bib25] [AM88] T. S. Abdelrahman and T. N. Mudge. Parallel branch-and-bound algorithms on hypercube multiprocessors. In Proceedings of the Third Conference on Hypercubes, Concurrent Computers, and Applications, 1492–1499, New York, NY, 1988. ACM Press.

[bib26] [Amd67] G. M. Amdahl. Validity of the single processor approach to achieving large scale computing capabilities. In AFIPS Conference Proceedings, 483–485, 1967.

[bib27] [And91] G. R. Andrews. Concurrent Programming: Principles and Practice. Benjamin/Cummings, Redwood City, CA, <year>1991</year>.

[bib28] [AOB93] B. Abali, F. Ozguner, and A. Bataineh. Balanced parallel sort on hypercube multiprocessors. IEEE Transactions on Parallel and Distributed Systems, 4(5):572–581, May 1993.

[bib29] [AS87] B. Awerbuch and Y. Shiloach. New connectivity and MSF algorithms for shuffle-exchange network and PRAM. IEEE Transactions on Computers, C– 36(10):1258–1263, October 1987.

[bib30] [AU72] A. V. Aho and J. D. Ullman. The Theory of Parsing, Translation and Compiling: Volume 1, Parsing. Prentice-Hall, Englewood Cliffs, NJ, <year>1972</year>.

[bib31] [B⁺97] L. S. Blackford et al. ScaLAPACK Users’ Guide. SIAM, <year>1997</year>.

[bib32] [BA82] M. Ben-Ari. Principles of Concurrent Programming. Prentice-Hall, Englewood Cliffs, NJ, <year>1982</year>.

[bib33] [Bab88] R. G. Babb. Programming Parallel Processors. Addison-Wesley, Reading, MA, <year>1988</year>.

[bib34] [Bai90] D. H. Bailey. FFTs in external or hierarchical memory. Journal of Supercomputing, 4:23–35, 1990.

[bib35] [Bar68] G. H. Barnes. The ILLIAC IV computer. IEEE Transactions on Computers, C-17(8):746–757, 1968.

[bib36] [Bat68] K. E. Batcher. Sorting networks and their applications. In Proceedings of the 1968 Spring Joint Computer Conference , 307–314, 1968.

[bib37] [Bat76] K. E. Batcher. The Flip network in STARAN. In Proceedings of International Conference on Parallel Processing, 65–71, 1976.

[bib38] [Bat80] K. E. Batcher. Design of a massively parallel processor. IEEE Transactions on Computers, 836–840, September 1980.

[bib39] [Bau78] G. M. Baudet. The Design and Analysis of Algorithms for Asynchronous Multiprocessors. Ph.D. Thesis, Carnegie-Mellon University, Pittsburgh, PA, <year>1978</year>.

[bib40] [BB90] K. P. Belkhale and P. Banerjee. Approximate algorithms for the partitionable independent task scheduling problem. In Proceedings of the 1990 International Conference on Parallel Processing, I72–I75, 1990.

[bib41] [BBN89] BBN Advanced Computers Inc. TC-2000 Technical Product Summary. Cambridge, MA. <year>1989</year>.

[bib42] [BCCL95] R. E. Bixby, W. Cook, A. Cox, and E. K. Lee. Parallel mixed integer programming. Technical Report CRPC TR 95554, Center for Research on Parallel Computation, Research Monograph, 1995.

[bib43] [BCJ90] E. C. Bronson, T. L. Casavant, and L. H. Jamieson. Experimental application-driven architecture analysis of an SIMD/MIMD parallel processing system. IEEE Transactions on Parallel and Distributed Systems, 1(2):195–205, 1990.

[bib44] [BDHM84] D. Bitton, D. J. DeWitt, D. K. Hsiao, and M. J. Menon. A taxonomy of parallel sorting. Computing Surveys, 16(3):287–318, September 1984.

[bib45] [Bel57] R. Bellman. Dynamic Programming. Princeton University Press, Princeton, NJ, <year>1957</year>.

[bib46] [Bel58] R. Bellman. On a routing problem. Quarterly of Applied Mathematics, 16(1):87–90, 1958.

[bib47] [Ben80] J. L. Bentley. A parallel algorithm for constructing minimum spanning trees. Journal of the ACM, 27(1):51–59, March 1980.

[bib48] [Ber84] S. Berkowitz. On computing the determinant in small parallel time using a small number of processors. Information Processing Letters, 18(3):147–150, March 1984.

[bib49] [Ber89] J. Berntsen. Communication efficient matrix multiplication on hypercubes. Parallel Computing, 12:335–342, 1989.

[bib50] [BH82] A. Borodin and J. E. Hopcroft. Routing merging and sorting on parallel models of computation. In Proceedings of the 14th Annual ACM Symposium on Theory of Computing, 338–344, May 1982.

[bib51] [Bix91] R. E. Bixby. Two applications of linear programming. In Proceedings of the Workshop on Parallel Computing of Discrete Optimization Problems, 1991.

[bib52] [BJK⁺95] R. Blumofe, C. Joerg, B. Kuszmaul, C. Leiserson, K. Randall, and Y. Zhou. Cilk: An efficient multithreaded runtime system. In Proceedings of the 5th Symposium on Principles and Practice of Parallel Programming, 1995.

[bib53] [BKH89] S. Bershader, T. Kraay, and J. Holland. The giant-Fourier-transform. In Proceedings of the Fourth Conference on Hypercubes, Concurrent Computers, and Applications: Volume I, 387–389, 1989.

[bib54] [Ble90] G. E. Blelloch. Vector Models for Data-Parallel Computing. MIT Press, Cambridge, MA, <year>1990</year>.

[bib55] [BMCP98] A. Brungger, A. Marzetta, J. Clausen, and M. Perregaard. Solving large-scale qap problems in parallel with the search library zram. Journal of Parallel and Distributed Computing, 50:157–169, 1998.

[bib56] [BNK92] A. Bar-Noy and S. Kipnis. Designing broadcasting algorithms in the postal model for message-passing systems. In Proceedings of 4th ACM Symposium on Parallel Algorithms and Architectures, 13–22, 1992.

[bib57] [BOS⁺91] D. P. Bertsekas, C. Ozveren, G. D. Stamoulis, P. Tseng, and J. N. Tsitsiklis. Optimal communication algorithms for hypercubes. Journal of Parallel and Distributed Computing, 11:263–275, 1991.

[bib58] [BR90] R. Boppana and C. S. Raghavendra. On optimal and practical routing methods for a massive data movement operation on hypercubes. Technical report, University of Southern California, Los Angeles, CA, 1990.

[bib59] [Bra97] R. Bramley. Technology news & reviews: Chemkin software; OpenMP Fortran Standard; ODE toolbox for Matlab; Java products; Scientific WorkPlace 3.0. IEEE Computational Science and Engineering, 4(4):75–78, October/December 1997.

[bib60] [Bro79] K. Brown. Dynamic programming in computer science. Technical Report CMU-CS-79-106, Carnegie Mellon University, Pittsburgh, PA, 1979.

[bib61] [BS78] G. M. Baudet and D. Stevenson. Optimal sorting algorithms for parallel computers. IEEE Transactions on Computers, C–27(1):84–87, January 1978.

[bib62] [BT89] D. P. Bertsekas and J. N. Tsitsiklis. Parallel and Distributed Computation: Numerical Methods. Prentice-Hall, NJ, <year>1989</year>.

[bib63] [BT97] D. P. Bertsekas and J. N. Tsitsiklis. Parallel and Distributed Computation: Numerical Methods. Athena Scientific, <year>1997</year>.

[bib64] [But97] D. R. Butenhof. Programming with POSIX Threads. Addison-Wesley, Reading, MA, <year>1997</year>.

[bib65] [Buy99] R.Buyya, editor. High Performance Cluster Computing: Architectures and Systems. Prentice Hall, <year>1999</year>.

[bib66] [BW89] M. L. Barton and G. R. Withers. Computing performance as a function of the speed, quantity, and the cost of processors. In Supercomputing ’89 Proceedings, 759–764, 1989.

[bib67] [BW97] J. Beveridge and R. Wiener. Multithreading Applications in Win32: the Complete Guide to Threads. Addison-Wesley Developers Press, Reading, MA, <year>1997</year>.

[bib68] [C⁺95] J. Choi et al. A proposal for a set of Parallel Basic Linear Algebra Subprograms. Technical Report CS-95-292, Computer Science Department, University of Tennessee, 1995.

[bib69] [CAHH91] N. P. Chrisopchoides, M. Aboelaze, E. N. Houstis, and C. E. Houstis. The parallelization of some level 2 and 3 BLAS operations on distributed-memory machines. In Proceedings of the First International Conference of the Austrian Center of Parallel Computation. Springer-Verlag Series Lecture Notes in Computer Science, 1991.

[bib70] [Can69] L. E. Cannon. A cellular computer to implement the Kalman Filter Algorithm. Ph.D. Thesis, Montana State University, Bozman, MT, <year>1969</year>.

[bib71] [Car89] G.F.Carey, editor. Parallel Supercomputing: Methods, Algorithms and Applications. Wiley, New York, NY, <year>1989</year>.

[bib72] [CD87] S. Chandran and L. S. Davis. An approach to parallel vision algorithms. In R.Porth, editor, Parallel Processing. SIAM, Philadelphia, PA, <year>1987</year>.

[bib73] [CDK⁺00] R.Chandra, L.Dagum, D.Kohr, D.Maydan, J.McDonald, and R.M. (editors). Parallel Programming in OpenMP. Morgan Kaufmann Publishers, <year>2000</year>.

[bib74] [CG87] E. Chu and A. George. Gaussian elimination with partial pivoting and load balancing on a multiprocessor. Parallel Computing, 5:65–74, 1987.

[bib75] [CGK93] D. Challou, M. Gini, and V. Kumar. Parallel search algorithms for robot motion planning. In Proceedings of the IEEE Conference on Robotics and Automation, 46–51, 1993.

[bib76] [CGL92] D. E. Culler, M. Gunter, and J. C. Lee. Analysis of multithreaded microprocessors under multiprogramming. Report UCB/CSD 92/687, University of California, Berkeley, Computer Science Division, Berkeley, CA, May 1992.

[bib77] [Cha79] A. K. Chandra. Maximal parallelism in matrix multiplication. Technical Report RC-6193, IBM T. J. Watson Research Center, Yorktown Heights, NY, 1979.

[bib78] [Cha87] R. Chamberlain. An alternate view of LU factorization on a hypercube multiprocessor. In M.T.Heath, editor, Hypercube Multiprocessors 1987, 569–575. SIAM, Philadelphia, PA, <year>1987</year>.

[bib79] [CJP83] H. Crowder, E. L. Johnson, and M. Padberg. Solving large-scale zero-one linear programming problem. Operations Research, 2:803–834, 1983.

[bib80] [CKP⁺93a] D. Culler, R. Karp, D. Patterson, A. Sahay, K. Schauser, E. Santos, R. Subramonian, and T. von Eicken. LogP: Towards a realistic model of parallel computation. In Proceedings of the Fourth ACM SIGPLAN Symposium on Principles and Practices of Parallel Programming, 1–12, 1993.

[bib81] [CKP⁺93b] D. E. Culler, R. Karp, D. A. Patterson, et al. Logp: Towards a realistic model of parallel computation. In Principles and Practices of Parallel Programming, <year>May 1993</year>.

[bib82] [CL93] B. Codenotti and M. Leoncini. Introduction to Parallel Processing. Addison-Wesley, <year>1993</year>.

[bib83] [CLC81] F. Y. Chin, J. Lam, and I. Chen. Optimal parallel algorithms for the connected component problem. In Proceedings of the 1981 International Conference on Parallel Processing, 170–175, 1981.

[bib84] [CLC82] F. Y. Chin, J. Lam, and I. Chen. Efficient parallel algorithms for some graph problems. Communications of the ACM, 25(9):659–665, September 1982.

[bib85] [CLR90] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. MIT Press, McGraw-Hill, New York, NY, <year>1990</year>.

[bib86] [CM82] K. M. Chandy and J. Misra. Distributed computation on graphs: Shortest path algorithms. Communications of the ACM, 25(11):833–837, November 1982.

[bib87] [CM98] B. Chapman and P. Mehrotra. OpenMP and HPF: Integrating two paradigms. Lecture Notes in Computer Science, 1470, <year>1998</year>.

[bib88] [Col88] R. Cole. Parallel merge sort. SIAM Journal on Computing, 17(4):770–785, August 1988.

[bib89] [Col89] M. Cole. Algorithmic Skeletons: Structured Management of Parallel Computation. MIT Press, Cambridge, MA, <year>1989</year>.

[bib90] [Con89] T. Conlon. Programming in PARLOG. Addison-Wesley, Reading, MA, <year>1989</year>.

[bib91] [CR89] E. A. Carmona and M. D. Rice. A model of parallel performance. Technical Report AFWL-TR-89-01, Air Force Weapons Laboratory, 1989.

[bib92] [CR91] E. A. Carmona and M. D. Rice. Modeling the serial and parallel fractions of a parallel algorithm. Journal of Parallel and Distributed Computing, 1991.

[bib93] [CS88] B. V. Cherkassky and R. Smith. Efficient mapping and implementations of matrix algorithms on a hypercube. Journal of Supercomputing, 2:7–27, 1988.

[bib94] [CSG98] D. E. Culler, J. P. Singh, and A. Gupta. Parallel Computer Architecture: A Hardware/Software Approach. Morgan Kaufmann, <year>1998</year>.

[bib95] [CT92] K. M. Chandy and S. Taylor. An Introduction to Parallel Programming. Jones and Bartlett, Austin, TX, <year>1992</year>.

[bib96] [CV91] B. S. Chlebus and I. Vrto. Parallel quicksort. Journal of Parallel and Distributed Processing, 1991.

[bib97] [Cve87] Z. Cvetanovic. Performance analysis of the FFT algorithm on a shared-memory parallel architecture. IBM Journal of Research and Development, 31(4):435–451, 1987.

[bib98] [CWP98] A. Cohen, M. Woodring, and R. Petrusha. Win32 Multithreaded Programming. O’Reilly & Associates, <year>1998</year>.

[bib99] [D⁺92] W. J. Dally et al. The message-driven processor. IEEE Micro, 12(2):23–39, 1992.

[bib100] [Dal87] W. J. Dally. A VLSI Architecture for Concurrent Data Structures. Kluwer Academic Publishers, Boston, MA, <year>1987</year>.

[bib101] [Dal90a] W. J. Dally. Analysis of k-ary n-cube interconnection networks. IEEE Transactions on Computers, 39(6), June 1990.

[bib102] [Dal90b] W. J. Dally. Network and processor architecture for message-driven computers. In R.Sauya and G.Birtwistle, editors, VLSI and Parallel Computation. Morgan Kaufmann, San Mateo, CA, <year>1990</year>.

[bib103] [Dav86] G. J. Davis. Column LU factorization with pivoting on a hypercube multiprocessor. SIAM Journal on Algebraic and Discrete Methods, 7:538–550, 1986. Also available as Technical Report ORNL-6219, Oak Ridge National Laboratory, Oak Ridge, TN, 1985.

[bib104] [DCG90] J. D. DeMello, J. L. Calvet, and J. M. Garcia. Vectorization and multitasking of dynamic programming in control: experiments on a CRAY-2. Parallel Computing, 13:261–269, 1990.

[bib105] [DDSV99] J. Dongarra, I. S. Duff, D. Sorensen, and H. V. Vorst. Numerical Linear Algebra for High Performance Computers (Software, Environments, Tools). SIAM, <year>1999</year>.

[bib106] [DeC89] A. L. DeCegama. The Technology of Parallel Processing: Parallel Processing Architectures and VLSI Hardware: Volume 1. Prentice-Hall, Englewood Cliffs, NJ, <year>1989</year>.

[bib107] [DEH89] P. M. Dew, R. A. Earnshaw, and T. R. Heywood. Parallel Processing for Computer Vision and Display. Addison-Wesley, Reading, MA, <year>1989</year>.

[bib108] [Dem82] J. Deminet. Experiences with multiprocessor algorithms. IEEE Transactions on Computers, C-31(4):278–288, 1982.

[bib109] [DFHM82] D. J. DeWitt, D. B. Friedland, D. K. Hsiao, and M. J. Menon. A taxonomy of parallel sorting algorithms. Technical Report TR-482, Computer Sciences Department, University of Wisconsin, Madison, WI, 1982.

[bib110] [DFRC96] F. Dehne, A. Fabri, and A. Rau-Chaplin. Scalable parallel computational geometry for coarse grained multicomputers. International Journal on Computational Geometry, 6(3):379–400, 1996.

[bib111] [DHvdV93] J. W. Demmel, M. T. Heath, and H. A. van der Vorst. Parallel numerical linear algebra. Acta Numerica, 111–197, 1993.

[bib112] [Dij59] E. W. Dijkstra. A note on two problems in connection with graphs. Numerische Mathematik, 1:269–271, 1959.

[bib113] [DM93] S. Dutt and N. R. Mahapatra. Parallel A* algorithms and their performance on hypercube multiprocessors. In Proceedings of the Seventh International Parallel Processing Symposium, 797–803, 1993.

[bib114] [DM98] L. Dagum and R. Menon. OpenMP: An industry-standard API for shared-memory programming. IEEE Computational Science and Engineering, 5(1):46–55, January/March 1998.

[bib115] [DNS81] E. Dekel, D. Nassimi, and S. Sahni. Parallel matrix and graph algorithms. SIAM Journal on Computing, 10:657–673, 1981.

[bib116] [Dra96] D. G. Drake. Introduction to Java threads. JavaWorld: IDG’s magazine for the Java community, 1(2), April 1996.

[bib117] [DRGNP] F. Darema-Rogers, D. George, V. Norton, and G. Pfister. VM parallel environment. In Proceedings of the IBM Kingston Parallel Processing Symposium.

[bib118] [DS86] W. J. Dally and C. L. Seitz. The torus routing chip. Journal of Distributed Computing, 1(3):187–196, 1986.

[bib119] [DS87] W. J. Dally and C. L. Seitz. Deadlock-free message routing in multiprocessor interconnection networks. IEEE Transactions on Computers, C-36(5):547– 553, 1987.

[bib120] [DSG83] E. W. Dijkstra, W. H. Seijen, and A. J. M. V. Gasteren. Derivation of a termination detection algorithm for a distributed computation. Information Processing Letters, 16(5):217–219, 1983.

[bib121] [DT89] L. Desbat and D. Trystram. Implementing the discrete Fourier transform on a hypercube vector-parallel computer. In Proceedings of the Fourth Conference on Hypercubes, Concurrent Computers, and Applications: Volume I, 407– 410, 1989.

[bib122] [DV87] K. A. Doshi and P. J. Varman. Optimal graph algorithms on a fixed-size linear array. IEEE Transactions on Computers, C–36(4):460–470, April 1987.

[bib123] [dV89] E. F. V. de Velde. Multicomputer matrix computations: Theory and practice. In Proceedings of the Fourth Conference on Hypercubes, Concurrent Computers, and Applications, 1303–1308, 1989.

[bib124] [DY81] N. Deo and Y. B. Yoo. Parallel algorithms for the minimum spanning tree problem. In Proceedings of the 1981 International Conference on Parallel Processing, 188–189, 1981.

[bib125] [Eck94] J. Eckstein. Parallel branch-and-bound methods for mixed-integer programming on the cm-5. SIAM Journal on Optimization, 4(4):794–814, 1994.

[bib126] [Eck97] J. Eckstein. Distributed versus centralized storage and control for parallel branch and bound: Mixed integer programming on the cm-5. Computational Optimization and Applications, 7(2):199–220, 1997.

[bib127] [Ede89] A. Edelman. Optimal matrix transposition and bit-reversal on hypercubes: Node address–memory address exchanges. Technical report, Thinking Machines Corporation, Cambridge, MA, 1989.

[bib128] [EDH80] O. I. El-Dessouki and W. H. Huen. Distributed enumeration on network computers. IEEE Transactions on Computers, C-29:818–825, September 1980.

[bib129] [EHHR88] S. C. Eisenstat, M. T. Heath, C. S. Henkel, and C. H. Romine. Modified cyclic algorithms for solving triangular systems on distributed-memory multiprocessors. SIAM Journal on Scientific and Statistical Computing, 9(3):589–600, 1988.

[bib130] [EHMN90] M. Evett, J. Hendler, A. Mahanti, and D. Nau. PRA*: A memory-limited heuristic search procedure for the connection machine. In Proceedings of the Third Symposium on the Frontiers of Massively Parallel Computation, 145– 149, 1990.

[bib131] [Ekl72] J. O. Eklundh. A fast computer method for matrix transposing. IEEE Transactions on Computers, 21(7):801–803, 1972.

[bib132] [Ert92] W. Ertel. OR—parallel theorem proving with random competition. In A.Voronokov, editor, LPAR ’92: Logic Programming and Automated Reasoning, 226–237. Springer-Verlag, New York, NY, <year>1992</year>.

[bib133] [EZL89] D. L. Eager, J. Zahorjan, and E. D. Lazowska. Speedup versus efficiency in parallel systems. IEEE Transactions on Computers, 38(3):408–423, 1989.

[bib134] [Fen81] T. Y. Feng. A survey of interconnection networks. IEEE Computer, 12–27, December 1981.

[bib135] [FF82] R. A. Finkel and J. P. Fishburn. Parallelism in alpha-beta search. Artificial Intelligence, 19:89–106, 1982.

[bib136] [FF86] G. C. Fox and W. Furmanski. Optimal communication algorithms on hypercube. Technical Report CCCP-314, California Institute of Technology, Pasadena, CA, 1986.

[bib137] [FJDS96] L. Fosdick, E. Jessup, G. Domik, and C. Schauble. Introduction to High-Performance Scientific Computing. MIT Press, <year>1996</year>.

[bib138] [FJL⁺88] G. C. Fox, M. Johnson, G. Lyzenga, S. W. Otto, J. Salmon, and D. Walker.Solving Problems on Concurrent Processors: Volume 1. Prentice-Hall, Englewood Cliffs, NJ, <year>1988</year>.

[bib139] [FK88] C. Ferguson and R. Korf. Distributed tree search and its application to alpha-beta pruning. In Proceedings of the 1988 National Conference on Artificial Intelligence, 1988.

[bib140] [FK89] H. P. Flatt and K. Kennedy. Performance of parallel processors. Parallel Computing, 12:1–20, 1989.

[bib141] [FKO86] E. Felten, S. Karlin, and S. W. Otto. Sorting on a hypercube. Caltech/JPL, 1986. Hm 244.

[bib142] [Fla90] H. P. Flatt. Further applications of the overhead model for parallel systems. Technical Report G320-3540, IBM Corporation, Palo Alto Scientific Center, Palo Alto, CA, 1990.

[bib143] [Flo62] R. W. Floyd. Algorithm 97: Shortest path. Communications of the ACM, 5(6):345, June 1962.

[bib144] [Fly72] M. J. Flynn. Some computer organizations and their effectiveness. IEEE Transactions on Computers, C-21(9):948–960, 1972.

[bib145] [Fly95] M. J. Flynn. Computer Architecture: Pipelined and Parallel Processor Design. Jones and Bartlett, <year>1995</year>.

[bib146] [FM70] W. D. Frazer and A. C. McKellar. Samplesort: A sampling approach to minimal storage tree sorting. Journal of the ACM, 17(3):496–507, July 1970.

[bib147] [FM87] R. A. Finkel and U. Manber. DIB—a distributed implementation of backtracking. ACM Transactions on Programming Languages and Systems, 9(2):235–256, April 1987.

[bib148] [FM92] R. Frye and J. Myczkowski. Load balancing algorithms on the connection machine and their use in Monte-Carlo methods. In Proceedings of the Unstructured Scientific Computation on Multiprocessors Conference, 1992.

[bib149] [FMM94] R. Feldmann, P. Mysliwietz, and B. Monien. Studying overheads in massively parallel min/max-tree evaluation. In Proc. of the 6th ACM Symposium on Parallel Algorithms and Architectures, 94–103, 1994.

[bib150] [FOH87] G. C. Fox, S. W. Otto, and A. J. G. Hey. Matrix algorithms on a hypercube I: Matrix multiplication. Parallel Computing, 4:17–31, 1987.

[bib151] [Fos95] I. Foster. Designing and Building Parallel Programs: Concepts and Tools for Parallel Software Engineering. Addison-Wesley, <year>1995</year>.

[bib152] [Fou94] T. J. Fountain. Parallel Computing: Principles and Practice. Cambridge University Press, <year>1994</year>.

[bib153] [FR62] L. R. Ford and R. L. Rivest. Flows in Networks. Princeton University Press, Princeton, NJ, <year>1962</year>.

[bib154] [Fra93] M. Franklin. The multiscalar architecture. Technical Report CS-TR-1993-1196, University of Wisconsin, 1993.

[bib155] [FTI90] M. Furuichi, K. Taki, and N. Ichiyoshi. A multi-level load balancing scheme for OR-parallel exhaustive search programs on the Multi-PSI. In Proceedings of the Second ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 50–59, 1990.

[bib156] [FW78] S. Fortune and J. Wyllie. Parallelism in random access machines. In Proceedings of ACM Symposium on Theory of Computing, 114–118, 1978.

[bib157] [Gal95] B. O. Gallmeister. Posix. 4 : Programming for the Real World. O’Reilly & Associates, <year>1995</year>.

[bib158] [GBD⁺94] G. A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek, and V. Sunderam. PVM: Parallel Virtual Machine. MIT Press, Cambridge, MA, <year>1994</year>.

[bib159] [Gei85] G. A. Geist. Efficient parallel LU factorization with pivoting on a hypercube multiprocessor. Technical Report ORNL-6211, Oak Ridge National Laboratory, Oak Ridge, TN, 1985.

[bib160] [GGK⁺83] A. Gottlieb, R. Grishman, C. P. Kruskal, K. P. McAuliffe, L. Rudolph, and M. Snir. The NYU Ultracomputer—designing a MIMD, shared memory parallel computer. IEEE Transactions on Computers, C–32(2):175–189, February 1983.

[bib161] [GGK93] A. Y. Grama, A. Gupta, and V. Kumar. Isoefficiency: Measuring the scalability of parallel algorithms and architectures. IEEE Parallel and Distributed Technology, 1(3):12–21, August 1993.

[bib162] [GH85] G. A. Geist and M. T. Heath. Parallel Cholesky factorization on a hypercube multiprocessor. Technical Report ORNL-6190, Oak Ridge National Laboratory, Oak Ridge, TN, 1985.

[bib163] [GH86] G. A. Geist and M. T. Heath. Matrix factorization on a hypercube multiprocessor. In M.T.Heath, editor, Hypercube Multiprocessors 1986, 161–180. SIAM, Philadelphia, PA, <year>1986</year>.

[bib164] [GH01] S. Goedecker and A. Hoisie. Performance Optimization of Numerically Intensive Codes. SIAM, <year>2001</year>.

[bib165] [Gib85] A. Gibbons. Algorithmic Graph Theory. Cambridge University Press, Cambridge, <year>1985</year>.

[bib166] [Gib89] P. B. Gibbons. A more practical PRAM model. In Proceedings of the 1989 ACM Symposium on Parallel Algorithms and Architectures, 158–168, 1989.

[bib167] [GK91] A. Gupta and V. Kumar. The scalability of matrix multiplication algorithms on parallel computers. Technical Report TR 91-54, Department of Computer Science, University of Minnesota, Minneapolis, MN, 1991. A short version appears in Proceedings of 1993 International Conference on Parallel Processing, pages III-115–III-119, 1993.

[bib168] [GK93a] A. Gupta and V. Kumar. Performance properties of large scale parallel systems. Journal of Parallel and Distributed Computing, 19:234–244, 1993. Also available as Technical Report TR 92-32, Department of Computer Science, University of Minnesota, Minneapolis, MN.

[bib169] [GK93b] A. Gupta and V. Kumar. The scalability of FFT on parallel computers. IEEE Transactions on Parallel and Distributed Systems, 4(8):922–932, August 1993. A detailed version is available as Technical Report TR 90-53, Department of Computer Science, University of Minnesota, Minneapolis, MN.

[bib170] [GKP92] A. Grama, V. Kumar, and P. M. Pardalos. Parallel processing of discrete optimization problems. In Encyclopaedia of Microcomputers. Marcel Dekker Inc., New York, <year>1992</year>.

[bib171] [GKR91] A. Y. Grama, V. Kumar, and V. N. Rao. Experimental evaluation of load balancing techniques for the hypercube. In Proceedings of the Parallel Computing ’91 Conference, 497–514, 1991.

[bib172] [GKRS96] A. Grama, V. Kumar, S. Ranka, and V. Singh. A³: A simple and asymptotically accurate model for parallel computation. In Proceedings of the Sixth Symposium on Frontiers of Massively Parallel Computing, Annapolis, MD, 1996.

[bib173] [GKS92] A. Gupta, V. Kumar, and A. H. Sameh. Performance and scalability of preconditioned conjugate gradient methods on parallel computers. Technical Report TR 92-64, Department of Computer Science, University of Minnesota, Minneapolis, MN, 1992. A short version appears in Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, pages 664–674, 1993.

[bib174] [GKT79] L. J. Guibas, H. T. Kung, and C. D. Thompson. Direct VLSI Implementation of Combinatorial Algorithms. In Proceedings of Conference on Very Large Scale Integration, California Institute of Technology, 509–525, 1979.

[bib175] [GL96a] G. H. Golub and C. V. Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, MD, <year>1996</year>.

[bib176] [GL96b] W. D. Gropp and E. Lusk. User’s Guide for mpich, a Portable Implementation of MPI. Mathematics and Computer Science Division, Argonne National Laboratory. ANL-96/6. <year>1996</year>.

[bib177] [GLDS96] W. Gropp, E. Lusk, N. Doss, and A. Skjellum. A high-performance, portable implementation of the MPI message passing interface standard. Parallel Computing, 22(6):789–828, September 1996.

[bib178] [GLS99] W. Gropp, E. Lusk, and A. Skjellum. Using MPI. MIT Press, <year>1999</year>. 2nd Edition.

[bib179] [GMB88] J. L. Gustafson, G. R. Montry, and R. E. Benner. Development of parallel methods for a 1024-processor hypercube. SIAM Journal on Scientific and Statistical Computing, 9(4):609–638, 1988.

[bib180] [GO93] G. H. Golub and J. M. Ortega. Scientific Computing: An Introduction with Parallel Computing. Academic Press, <year>1993</year>.

[bib181] [GPS90] K. A. Gallivan, R. J. Plemmons, and A. H. Sameh. Parallel algorithms for dense linear algebra computations . SIAM Review, 32(1):54–135, March 1990. Also appears in K. A. Gallivan et al. Parallel Algorithms for Matrix Computations. SIAM, Philadelphia, PA, <year>1990</year>.

[bib182] [GR88] G. A. Geist and C. H. Romine. LU factorization algorithms on distributed-memory multiprocessor architectures. SIAM Journal on Scientific and Statistical Computing, 9(4):639–649, 1988. Also available as Technical Report ORNL/TM-10383, Oak Ridge National Laboratory, Oak Ridge, TN, 1987.

[bib183] [GR90] A. Gibbons and W. Rytter. Efficient Parallel Algorithms. Cambridge University Press, Cambridge, UK, <year>1990</year>.

[bib184] [Gre91] S. Green. Parallel Processing for Computer Graphics. MIT Press, Cambridge, MA, <year>1991</year>.

[bib185] [GSNL98] W. Gropp, M. Snir, W. Nitzberg, and E. Lusk. MPI: The Complete Reference. MIT Press, <year>1998</year>.

[bib186] [GT88] A. V. Goldberg and R. E. Tarjan. A new approach to the maximum-flow problem. Journal of the ACM, 35(4):921–940, October 1988.

[bib187] [Gup87] A. Gupta. Parallelism in Production Systems. Morgan Kaufmann, Los Altos, CA, <year>1987</year>.

[bib188] [Gus88] J. L. Gustafson. Reevaluating Amdahl’s law. Communications of the ACM, 31(5):532–533, 1988.

[bib189] [Gus92] J. L. Gustafson. The consequences of fixed time performance measurement. In Proceedings of the 25th Hawaii International Conference on System Sciences: Volume III, 113–124, 1992.

[bib190] [HB84] K. Hwang and F. A. Briggs. Computer Architecture and Parallel Processing. McGraw-Hill, New York, NY, <year>1984</year>.

[bib191] [HB88] M. M. Huntbach and F. W. Burton. Alpha-beta search on virtual tree machines. Information Science, 44:3–17, 1988.

[bib192] [HCH95] F. -H. Hsu, M. S. Campbell, and A. J. Hoane. Deep Blue system overview. In Proceedings of the 1995 International Conference on Supercomputing, Barcelona, Spain, 240–244, 1995.

[bib193] [HCS79] D. S. Hirschberg, A. K. Chandra, and D. V. Sarwate. Computing connected components on parallel computers. Communications of the ACM, 22(8):461– 464, August 1979.

[bib194] [HD87] S. -R. Huang and L. S. Davis. A tight upper bound for the speedup of parallel best-first branch-and-bound algorithms. Technical report, Center for Automation Research, University of Maryland, College Park, MD, 1987.

[bib195] [HD89a] S. R. Huang and L. S. Davis. Parallel iterative a* search: An admissible distributed heuristic search algorithm. In Proceedings of the Eleventh International Joint Conference on Artificial Intelligence, 23–29, 1989.

[bib196] [HD89b] K. Hwang and D. DeGroot. Parallel Processing for Supercomputers and Artificial Intelligence. McGraw-Hill, New York, NY, <year>1989</year>.

[bib197] [HDM97] J. Hill, S. Donaldson, and A. McEwan. Installation and user guide for the oxford bsp toolset: User guide for the oxford bsp toolset (v1.3) implementation of bsplib. Technical report, Oxford University Computing Laboratory, 1997.

[bib198] [Hea85] M. T. Heath. Parallel Cholesky factorization in message-passing multiprocessor environments. Technical Report ORNL-6150, Oak Ridge National Laboratory, Oak Ridge, TN, 1985.

[bib199] [HG97] S. Hamilton and L. Garber. Deep Blue’s hardware-software synergy. IEEE Computer, 30(10):29–35, October 1997.

[bib200] [Hil85] W. D. Hillis. The Connection Machine. MIT Press, Cambridge, MA, <year>1985</year>.

[bib201] [Hil90] M. D. Hill. What is scalability? Computer Architecture News, 18(4), 1990.

[bib202] [Hip89] P. G. Hipes. Matrix multiplication on the JPL/Caltech Mark IIIfp hypercube. Technical Report C3P 746, Concurrent Computation Program, California Institute of Technology, Pasadena, CA, 1989.

[bib203] [Hir76] D. S. Hirschberg. Parallel algorithms for the transitive closure and connected component problem. In Proceedings of the 8th Annual ACM Symposium on the Theory of Computing, 55–57, 1976.

[bib204] [Hir78] D. S. Hirschberg. Fast parallel sorting algorithms. Communications of the ACM, 21(8):657–666, August 1978.

[bib205] [HJ87] C. -T. Ho and S. L. Johnsson. Spanning balanced trees in Boolean cubes. Technical Report YALEU/DCS/RR-508, Department of Computer Science, Yale University, New Haven, CT, 1987.

[bib206] [HJE91] C. -T. Ho, S. L. Johnsson, and A. Edelman. Matrix multiplication on hypercubes using full bandwidth and constant storage. In Proceedings of the 1991 International Conference on Parallel Processing, 447–451, 1991.

[bib207] [HK96] S. Hambrusch and A. Khokhar. C³: A parallel model for coarse-grained machines. Journal of Parallel and Distributed Computing, 32(2):139–154, February 1996.

[bib208] [HLM84] R. E. Hiromoto, O. M. Lubeck, and J. Moore. Experiences with the Denelcor HEP. Parallel Computing, 1(3–4):197–206, 1984.

[bib209] [HLV90] S. H. S. Huang, H. Liu, and V. Vishwanathan. A sub-linear parallel algorithm for some dynamic programming problems. In Proceedings of the 1990 International Conference on Parallel Processing, III–261–III–264, 1990.

[bib210] [HM80] D. K. Hsiao and M. J. Menon. Parallel record-sorting methods for hardware realization. Osu-cisrc-tr-80-7, Computer Science Information Department, Ohio State University, Columbus, OH, 1980.

[bib211] [HMT⁺96] H. Hum, O. Maquelin, K. Theobald, X. Tian, and G. Gao. A study of the earth-manna multithreaded system. Intl. J. of Par. Prog., 24:319–347, 1996.

[bib212] [HNR90] P. Heidelberger, A. Norton, and J. T. Robinson. Parallel quicksort using fetch-and-add. IEEE Transactions on Computers, C-39(1):133–138, January 1990.

[bib213] [Hoa62] C. A. R. Hoare. Quicksort. Computer Journal, 5:10–15, 1962.

[bib214] [HP89] S. W. Hornick and F. P. Preparata. Deterministic PRAM simulation with constant redundancy. In Proceedings of the 1989 ACM Symposium on Parallel Algorithms and Architectures, 103–109, 1989.

[bib215] [HQ91] P. J. Hatcher and M. J. Quinn. Data Parallel Programming. MIT Press, Cambridge, MA, <year>1991</year>.

[bib216] [HR88] M. T. Heath and C. H. Romine. Parallel solution of triangular systems on distributed-memory multiprocessors. SIAM Journal on Scientific and Statistical Computing, 9(3):558–588, 1988.

[bib217] [HR91] C. -T. Ho and M. T. Raghunath. Efficient communication primitives on circuit-switched hypercubes. In Sixth Distributed Memory Computing Conference Proceedings, 390–397, 1991.

[bib218] [HS78] E. Horowitz and S. Sahni. Fundamentals of Computer Algorithms. Computer Science Press, Rockville, MD, <year>1978</year>.

[bib219] [HS86] W. D. Hillis and G. L. Steele. Data parallel algorithms. Communications of the ACM, 29(12):1170–1183, 1986.

[bib220] [Hsu90] F. -H. Hsu. Large scale parallelization of alpha-beta search: An algorithmic and architectural study with computer chess. Technical report, Carnegie Mellon University, Pittsburgh, PA, 1990. Ph.D. Thesis.

[bib221] [Hua85] M. A. Huang. Solving some graph problems with optimal or near-optimal speedup on mesh-of-trees networks. In Proceedings of the 26th Annual IEEE Symposium on Foundations of Computer Science, 232–340, 1985.

[bib222] [HX98] K. Hwang and Z. Xu. Scalable Parallel Computing. McGraw-Hill, New York, NY, <year>1998</year>.

[bib223] [Hyd99] P. Hyde. Java Thread Programming. Sams, <year>1999</year>.

[bib224] [IPS91] O. H. Ibarra, T. C. Pong, and S. M. Sohn. Parallel recognition and parsing on the hypercube. IEEE Transactions on Computers, 40(6):764–770, June 1991.

[bib225] [IYF79] M. Imai, Y. Yoshida, and T. Fukumura. A parallel searching scheme for multiprocessor systems and its application to combinatorial problems. In Proceedings of the International Joint Conference on Artificial Intelligence, 416–418, 1979.

[bib226] [Jaj92] J. Jaja. An Introduction to Parallel Algorithms. Addison-Wesley, Reading, MA, <year>1992</year>.

[bib227] [JAM87] V. K. Janakiram, D. P. Agrawal, and R. Mehrotra. Randomized parallel algorithms for Prolog programs and backtracking applications. In Proceedings of the 1987 International Conference on Parallel Processing, 278–281, 1987.

[bib228] [JAM88] V. K. Janakiram, D. P. Agrawal, and R. Mehrotra. A randomized parallel backtracking algorithm. IEEE Transactions on Computers, C-37(12), 1988.

[bib229] [JGD87] L.H.Jamieson, D.B.Gannon, and R.J.Douglass, editors. The Characteristics of Parallel Algorithms. MIT Press, Cambridge, MA, <year>1987</year>.

[bib230] [JH88] S. L. Johnsson and C. -T. Ho. Matrix transposition on Boolean n-cube configured ensemble architectures. SIAM Journal on Matrix Analysis and Applications, 9(3):419–454, July 1988.

[bib231] [JH89] S. L. Johnsson and C. -T. Ho. Optimum broadcasting and personalized communication in hypercubes. IEEE Transactions on Computers, 38(9):1249– 1268, September 1989.

[bib232] [JH91] S. L. Johnsson and C. -T. Ho. Optimal all-to-all personalized communication with minimum span on Boolean cubes. In Sixth Distributed Memory Computing Conference Proceedings, 299–304, 1991.

[bib233] [JKFM89] S. L. Johnsson, R. Krawitz, R. Frye, and D. McDonald. A radix-2 FFT on the connection machine. Technical report, Thinking Machines Corporation, Cambridge, MA, 1989.

[bib234] [JNS97] L. Johnson, G. Nemhauser, and M. Savelsbergh. Progress in integer programming: An exposition. Technical report, School of Industrial and Systems Engineering, Georgia Institute of Technology, 1997. Available from http://akula.isye.gatech.edu/mwps/mwps.html.

[bib235] [Joh77] D. B. Johnson. Efficient algorithms for shortest paths in sparse networks. Journal of the ACM, 24(1):1–13, March 1977.

[bib236] [Joh84] S. L. Johnsson. Combining parallel and sequential sorting on a boolean n-cube. In Proceedings of International Conference on Parallel Processing, 1984.

[bib237] [Joh87] S. L. Johnsson. Communication efficient basic linear algebra computations on hypercube architectures. Journal of Parallel and Distributed Computing, 4(2):133–172, April 1987.

[bib238] [Joh90] S. L. Johnsson. Communication in network architectures. In R.Suaya and G.Birtwistle, editors, VLSI and Parallel Computation, 223–389. Morgan Kaufmann, San Mateo, CA, <year>1990</year>.

[bib239] [JP93] M. T. Jones and P. E. Plassmann. A parallel graph coloring heuristic. SIAM Journal on Scientific Computing, 14:654–669, 1993.

[bib240] [JS87] J. -F. Jenq and S. Sahni. All pairs shortest paths on a hypercube multiprocessor. In Proceedings of the 1987 International Conference on Parallel Processing, 713–716, 1987.

[bib241] [KA88] R. A. Kamin and G. B. Adams. Fast Fourier transform algorithm design and tradeoffs. Technical Report RIACS TR 88.18, NASA Ames Research Center, Moffet Field, CA, 1988.

[bib242] [KA99a] Y. -K. Kwok and I. Ahmad. Benchmarking and comparison of the task graph scheduling algorithms. Journal of Parallel and Distributed Computing, 59:381–422, 1999.

[bib243] [KA99b] Y. -K. Kwok and I. Ahmad. Static scheduling algorithms for allocating directed task graphs to multiprocessors. ACM Computing Surveys, 31(4):406– 471, 1999.

[bib244] [KB57] T. C. Koopmans and M. J. Beckmann. Assignment problems and the location of economic activities. Econometrica, 25:53–76, 1957.

[bib245] [Ken90] Kendall Square Research Corporation. KSR-1 Overview. Waltham, MA. <year>1990</year>.

[bib246] [KF90] A. H. Karp and H. P. Flatt. Measuring parallel processor performance. Communications of the ACM, 33(5):539–543, 1990.

[bib247] [KG94] V. Kumar and A. Gupta. Analyzing scalability of parallel algorithms and architectures. Journal of Parallel and Distributed Computing, 22(3):379–391, 1994. Also available as Technical Report TR 91-18, Department of Computer Science Department, University of Minnesota, Minneapolis, MN.

[bib248] [KGK90] V.Kumar, P.S.Gopalakrishnan, and L.N.Kanal, editors. Parallel Algorithms for Machine Intelligence and Vision. Springer-Verlag, New York, NY, <year>1990</year>.

[bib249] [KGR94] V. Kumar, A. Grama, and V. N. Rao. Scalable load balancing techniques for parallel computers. Journal of Parallel and Distributed Computing, 22(1):60– 79, July 1994.

[bib250] [KH67] R. M. Karp and M. H. Held. Finite state processes and dynamic programming. SIAM Journal of Applied Math, 15:693–718, 1967.

[bib251] [KH83] M. Kumar and D. S. Hirschberg. An efficient implementation of Batcher’s odd-even merge algorithm and its application in parallel sorting schemes. IEEE Transactions on Computers, C–32, March 1983.

[bib252] [KK79] P. Kermani and L. Kleinrock. Virtual cut-through: A new communication switching technique. Computer Networks, 3(4):267–286, 1979.

[bib253] [KK83] V. Kumar and L. N. Kanal. A general branch-and-bound formulation for understanding and synthesizing and/or tree search procedures. Artificial Intelligence, 21:179–198, 1983.

[bib254] [KK84] V. Kumar and L. N. Kanal. Parallel branch-and-bound formulations for and/or tree search. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI–6:768–778, 1984.

[bib255] [KK88a] L. N. Kanal and V. Kumar. Search in Artificial Intelligence. Springer-Verlag, New York, NY, <year>1988</year>.

[bib256] [KK88b] V. Kumar and L. N. Kanal. The CDP: A unifying formulation for heuristic search, dynamic programming, and branch-and-bound. In L.N.Kanal and V.Kumar, editors, Search in Artificial Intelligence, 1–27. Springer-Verlag, New York, NY, <year>1988</year>.

[bib257] [KK93] G. Karypis and V. Kumar. Efficient Parallel Mappings of a Dynamic Programming Algorithm. In Proceedings of 7th International Parallel Processing Symposium, number 563–568, 1993.

[bib258] [KK94] G. Karypis and V. Kumar. Unstructured tree search on simd parallel computers. Journal of Parallel and Distributed Computing, 22(3):379–391, September 1994.

[bib259] [KK99] G. Karypis and V. Kumar. Parallel multilevel k-way partitioning for irregular graphs. SIAM Review, 41(2):278–300, 1999.

[bib260] [KKKS94] L.N.Kanal, V.Kumar, H.Kitano, and C.Suttner, editors. Parallel Processing for Artificial Intelligence. North-Holland, Amsterdam, The Netherlands, <year>1994</year>.

[bib261] [KN91] K. Kimura and I. Nobuyuki. Probabilistic analysis of the efficiency of the dynamic load distribution. In Sixth Distributed Memory Computing Conference Proceedings, 1991.

[bib262] [Knu73] D. E. Knuth. The Art of Computer Programming: Sorting and Searching. Addison-Wesley, Reading, MA, <year>1973</year>.

[bib263] [Kor81] W. Kornfeld. The use of parallelism to implement a heuristic search. In Proceedings of the International Joint Conference on Artificial Intelligence, 575–580, 1981.

[bib264] [Kow88] J. S. Kowalik. Parallel Computation and Computers for Artificial Intelligence. Kluwer Academic Publishers, Boston, MA, <year>1988</year>.

[bib265] [KP92] C. Kaklamanis and G. Persiano. Branch-and-bound and backtrack search on mesh-connected arrays of processors. In Proceedings of Fourth Annual Symposium on Parallel Algorithms and Architectures, 118–126, 1992.

[bib266] [KR87a] V. K. P. Kumar and C. S. Raghavendra. Array processor with multiple broadcasting. Journal of Parallel and Distributed Computing, 173–190, 1987.

[bib267] [KR87b] V. Kumar and V. N. Rao. Parallel depth-first search, part II: Analysis. International Journal of Parallel Programming, 16(6):501–519, 1987.

[bib268] [KR88] R. M. Karp and V. Ramachandran. A survey of complexity of algorithms for shared-memory machines. Technical Report 408, University of California, Berkeley, 1988.

[bib269] [KR89] V. Kumar and V. N. Rao. Load balancing on the hypercube architecture. In Proceedings of the Fourth Conference on Hypercubes, Concurrent Computers, and Applications, 603–608, 1989.

[bib270] [KRR88] V. Kumar, K. Ramesh, and V. N. Rao. Parallel best-first search of state-space graphs: A summary of results. In Proceedings of the 1988 National Conference on Artificial Intelligence, 122–126, 1988.

[bib271] [KRS88] C. P. Kruskal, L. Rudolph, and M. Snir. A complexity theory of efficient parallel algorithms. Technical Report RC13572, IBM T. J. Watson Research Center, Yorktown Heights, NY, 1988.

[bib272] [Kru56] J. B. Kruskal. On the shortest spanning subtree of a graph and the traveling salesman problem. In Proceedings of the AMS, volume 7, 48–50, 1956.

[bib273] [KS88] J. Kuehn and B. Smith. The horizon supercomputing system: architecture and software. In Proceedings of Supercomputing Conference, 28–34, 1988.

[bib274] [KS91a] L. V. Kale and V. Saletore. Efficient parallel execution of IDA* on shared and distributed-memory multiprocessors. In Sixth Distributed Memory Computing Conference Proceedings, 1991.

[bib275] [KS91b] V. Kumar and V. Singh. Scalability of parallel algorithms for the all-pairs shortest path problem. Journal of Parallel and Distributed Computing, 13(2):124–138, October 1991. A short version appears in the Proceedings of the International Conference on Parallel Processing, 1990.

[bib276] [KSS95] S. Kleiman, D. Shah, and B. Smaalders. Programming with Threads. SunSoft Press, Mountainview, CA, <year>1995</year>.

[bib277] [KU86] A. R. Karlin and E. Upfal. Parallel hashing – an efficient implementation of shared memory. In Proceedings of 18th ACM Conference on Theory of Computing, 160–168, 1986.

[bib278] [Kun80] J. T. Kung. The structure of parallel algorithms. In M.Yovits, editor, Advances in Computing, 73–74. Academic Press, San Diego, CA, <year>1980</year>.

[bib279] [Kun86] H. T. Kung. Memory requirements for balanced computer architectures. In Proceedings of the 1986 IEEE Symposium on Computer Architecture, 49–54, 1986.

[bib280] [KV92] K. Kreeger and N. R. Vempaty. Comparison of meshes vs. hypercubes for data rearrangement. Technical Report UCF-CS-92-28, Department of Computer Science, University of Central Florida, Orlando, FL, 1992.

[bib281] [KZ88] R. M. Karp and Y. Zhang. A randomized parallel branch-and-bound procedure. In Proceedings of the ACM Annual Symposium on Theory of Computing, 290–300, 1988.

[bib282] [Law75] D. H. Lawrie. Access and alignment of data in an array processor. IEEE Transactions on Computers, C-24(1):1145–1155, 1975.

[bib283] [LB95a] B. Lewis and D. J. Berg. Threads Primer: A Guide to Multithreaded Programming. Prentice Hall PTR/Sun Microsystems Press, <year>1995</year>.

[bib284] [LB95b] B. -H. Lim and R. Bianchini. Limits on the performance benefits of multithreading and prefetching. Research report RC 20238 (89547), IBM T. J. Watson Research Center, Yorktown Heights, NY, October 1995.

[bib285] [LB97] B. Lewis and D. J. Berg. Multithreaded Programming with Pthreads. Prentice Hall PTR/Sun Microsystems Press, <year>1997</year>.

[bib286] [LB98] T. G. Lewis and D. Berg. Multithreaded Programming with PThreads. Sun Microsystems Press / Prentice Hall, <year>1998</year>.

[bib287] [LC88] G. -J. Li and T. Coleman. A parallel triangular solver for a hypercube multiprocessor. SIAM Journal on Scientific and Statistical Computing, 9:485–502, 1988.

[bib288] [LC89] G. -J. Li and T. Coleman. A new method for solving triangular systems on distributed memory message passing multiprocessors. SIAM Journal on Scientific and Statistical Computing, 10:382–396, 1989.

[bib289] [LD90] S. Lakshmivarahan and S. K. Dhall. Analysis and Design of Parallel Algorithms: Arithmetic and Matrix Problems. McGraw-Hill, New York, NY, <year>1990</year>.

[bib290] [LDP89] M. R. Leuze, L. W. Dowdy, and K. H. Park. Multiprogramming a distributed-memory multiprocessor. Concurrency: Practice and Experience, 1(1):19–33, September 1989.

[bib291] [Lea99] D. Lea. Concurrent Programming in Java, Second Edition: Design Principles and Patterns. Addison-Wesley, <year>1999</year>.

[bib292] [Lei83] F. T. Leighton. Parallel computation using mesh of trees. In Proceedings of International Workshop on Graph-Theoretic Concepts in Computer Science, 1983.

[bib293] [Lei85a] F. T. Leighton. Tight bounds on the complexity of parallel sorting. IEEE Transactions on Computers, C–34(4):344–354, April 1985.

[bib294] [Lei85b] C. E. Leiserson. Fat-trees: Universal networks for hardware efficient supercomputing. In Proceedings of the 1985 International Conference on Parallel Processing, 393–402, 1985.

[bib295] [Lei92] F. T. Leighton. Introduction to Parallel Algorithms and Architectures. Morgan Kaufmann, San Mateo, CA, <year>1992</year>.

[bib296] [LER92] T. G. Lewis and H. El-Rewini. Introduction to Parallel Computing. Prentice-Hall, Englewood Cliffs, NJ, <year>1992</year>.

[bib297] [Les93] B. P. Lester. The Art of Parallel Programming. Prentice-Hall, Englewood Cliffs, NJ, <year>1993</year>.

[bib298] [Lev87] S. P. Levitan. Measuring communications structures in parallel architectures and algorithms. In L.H.Jamieson, D.B.Gannon, and R.J.Douglass, editors, The Characteristics of Parallel Algorithms. MIT Press, Cambridge, MA, <year>1987</year>.

[bib299] [Lew91] D. A. Lewine. Posix Programmer’s Guide: Writing Portable Unix Programs with the Posix. 1 Standard. O’Reilly & Associates, <year>1991</year>.

[bib300] [LHZ98] H. Lu, C. Hu, and W. Zwaenepoel. OpenMP on networks of workstations. In SC ’98, High Performance Networking and Computing Conference, Orlando, Florida, 1998.

[bib301] [Lil92] D. J. Lilja. Architectural Alternatives for Exploiting Parallelism. IEEE Computer Society Press, Los Alamitos, CA, <year>1992</year>.

[bib302] [Lin83] G. Lindstrom. The key node method: A highly parallel alpha-beta algorithm. Technical Report 83-101, Computer Science Department, University of Utah, Salt Lake City, UT, 1983.

[bib303] [Lin92] Z. Lin. A distributed fair polling scheme applied to or-parallel logic programming. International Journal of Parallel Programming, 20(4), August 1992.

[bib304] [LK72] K. N. Levitt and W. T. Kautz. Cellular arrays for the solution of graph problems. Communications of the ACM, 15(9):789–801, September 1972.

[bib305] [LK85] D. B. Leifker and L. N. Kanal. A hybrid SSS*/alpha-beta algorithm for parallel search of game trees. In Proceedings of the International Joint Conference on Artificial Intelligence, 1044–1046, 1985.

[bib306] [LLG⁺92] D. Lenoski, J. Laudon, K. Gharachorloo, W. D. Weber, A. Gupta, J. L. Hennessy, M. Horowitz, and M. Lam. The Stanford dash multiprocessor. IEEE Computer, 63–79, March 1992.

[bib307] [LM97] E. K. Lee and J. E. Mitchell. Computational experience of an interior-point algorithm in a parallel branch-and-cut framework. In Proceedings for SIAM Conference on Parallel Processing for Scientific Computing, 1997.

[bib308] [LMR88] F. T. Leighton, B. Maggs, and S. K. Rao. Universal packet routing algorithms. In 29th Annual Symposium on Foundations of Computer Science, 256–271, 1988.

[bib309] [Loa92] C. V. Loan. Computational Frameworks for the Fast Fourier Transform. SIAM, Philadelphia, PA, <year>1992</year>.

[bib310] [LP92] Y. Li and P. M. Pardalos. Parallel algorithms for the quadratic assignment problem. In P.M.Pardalos, editor, Advances in Optimization and Parallel Computing, 177–189. North-Holland, Amsterdam, The Netherlands, <year>1992</year>.

[bib311] [LPP88] F. Luccio, A. Pietracaprina, and G. Pucci. A probabilistic simulation of PRAMs on a bounded degree network. Information Processing Letters, 28:141–147, July 1988.

[bib312] [LPP89] F. Luccio, A. Pietracaprina, and G. Pucci. A new scheme for deterministic simulation of PRAMs in VLSI. SIAM Journal of Computing, 1989.

[bib313] [LRZ95] C. Leiserson, K. Randall, and Y. Zhou. Cilk: An efficient multithreaded runtime system. In Proceedings of the Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), Santa Barbara, CA, 1995.

[bib314] [LS84] T. H. Lai and S. Sahni. Anomalies in parallel branch and bound algorithms. Communications of the ACM, 594–602, 1984.

[bib315] [LS85] T. H. Lai and A. Sprague. Performance of parallel branch-and-bound algorithms. IEEE Transactions on Computers, C-34(10), October 1985.

[bib316] [LS86] T. H. Lai and A. Sprague. A note on anomalies in parallel branch-and-bound algorithms with one-to-one bounding functions. Information Processing Letters, 23:119–122, October 1986.

[bib317] [LSS88] J. Lee, E. Shragowitz, and S. Sahni. A hypercube algorithm for the 0/1 knapsack problem. Journal of Parallel and Distributed Computing, (5):438–456, 1988.

[bib318] [Lub86] M. Luby. A simple parallel algorithm for the maximal independent set problem. SIAM Journal on Computing, 15(4):1036–1053, 1986.

[bib319] [LW66] E. L. Lawler and D. Woods. Branch-and-bound methods: A survey. Operations Research, 14, 1966.

[bib320] [LW84] G. -J. Li and B. W. Wah. Computational efficiency of parallel approximate branch-and-bound algorithms. In Proceedings of the 1984 International Conference on Parallel Processing, 473–480, 1984.

[bib321] [LW85] G. -J. Li and B. W. Wah. Parallel processing of serial dynamic programming problems. In Proceedings of COMPSAC 85, 81–89, 1985.

[bib322] [LW86] G. -J. Li and B. W. Wah. Coping with anomalies in parallel branch-and-bound algorithms. IEEE Transactions on Computers, C-35, June 1986.

[bib323] [LW95] D. Lenoski and W. D. Weber. Scalable Shared-Memory Multiprocessing. Morgan Kaufmann, San Mateo, CA, <year>1995</year>.

[bib324] [LY86] M. Li and Y. Yesha. New lower bounds for parallel computations. In Proceedings of 18th ACM Conference on Theory of Computing, 177–187, 1986.

[bib325] [MC82] T. A. Marsland and M. S. Campbell. Parallel search of strongly ordered game trees. Computing Surveys, 14:533–551, 1982.

[bib326] [MD92] A. Mahanti and C. Daniels. SIMD parallel heuristic search. Artificial Intelligence, 1992.

[bib327] [MD93] N. R. Mahapatra and S. Dutt. Scalable duplicate pruning strategies for parallel A* graph search. In Proceedings of the Fifth IEEE Symposium on Parallel and Distributed Processing, 1993.

[bib328] [MdV87] O. A. McBryan and E. F. V. de Velde. Hypercube algorithms and implementations. SIAM Journal on Scientific and Statistical Computing, 8(2):s227–s287, March 1987.

[bib329] [Mes94] Message Passing Interface Forum. MPI: A Message-Passing Interface Standard. Available at http://www.mpi-forum.org. May 1994.

[bib330] [Mes97] Message Passing Interface Forum. MPI-2: Extensions to the Message-Passing Interface. Available at http://www.mpi-forum.org. July 1997.

[bib331] [MFMV90] B. Monien, R. Feldmann, P. Mysliwietz, and O. Vornberger. Parallel game tree search by dynamic tree decomposition. In V.Kumar, P.S.Gopalakrishnan, and L.N.Kanal, editors, Parallel Algorithms for Machine Intelligence and Vision. Springer-Verlag, New York, NY, <year>1990</year>.

[bib332] [MG89] C. U. Martel and D. Q. Gusfield. A fast parallel quicksort algorithm. Information Processing Letters, 30:97–102, 1989.

[bib333] [Mil91] D. Miller. Exact distributed algorithms for travelling salesman problem. In Proceedings of the Workshop on Parallel Computing of Discrete Optimization Problems, 1991.

[bib334] [MK99] J. Magee and J. Kramer. Concurrency: State Models and Java Programs. John Wiley & Sons, <year>1999</year>.

[bib335] [MKRS88] R. Miller, V. K. P. Kumar, D. I. Reisis, and Q. F. Stout. Meshes with reconfigurable buses. In Proceedings of MIT Conference on Advanced Research in VLSI, 163–178, 1988.

[bib336] [MM73] A. Martelli and U. Montanari. From dynamic programming to search algorithms with functional costs. In Proceedings of the International Joint Conference on Artifi cial Intelligence, 345–349, 1973.

[bib337] [MM91] P.Messina and A.Murli, editors. Practical Parallel Computing: Status and Prospects. Wiley, Chichester, UK, <year>1991</year>.

[bib338] [MMR95] B. Mans, T. Mautor, and C. Roucairol. A parallel depth first search branch and bound for the quadratic assignment problem. European Journal of Operational Research, 81(3):617–628, 1995.

[bib339] [Mod88] J. J. Modi. Parallel Algorithms and Matrix Computation. Oxford University Press, Oxford, UK, <year>1988</year>.

[bib340] [Moh83] J. Mohan. Experience with two parallel programs solving the traveling salesman problem. In Proceedings of the 1983 International Conference on Parallel Processing, 191–193, 1983.

[bib341] [Mol86] C. B. Moler. Matrix computation on distributed-memory multiprocessors. In M.T.Heath, editor, Hypercube Multiprocessors 1986, 181–195. SIAM, Philadelphia, PA, <year>1986</year>.

[bib342] [Mol87] C. B. Moler. Another look at Amdahl’s law. Technical Report TN-02-0587-0288, Intel Scientific Computers, 1987.

[bib343] [Mol93] D. I. Moldovan. Parallel Processing: From Applications to Systems. Morgan Kaufmann, San Mateo, CA, <year>1993</year>.

[bib344] [MP85] T. A. Marsland and F. Popowich. Parallel game tree search. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-7(4):442–452, July 1985.

[bib345] [MP93] D. L. Miller and J. F. Pekny. The role of performance metrics for parallel mathematical programming algorithms. ORSA Journal on Computing, 5(1), 1993.

[bib346] [MR] D. C. Marinescu and J. R. Rice. On high level characterization of parallelism. Technical Report CSD-TR-1011, CAPO Report CER-90-32, Computer Science Department, Purdue University, West Lafayette, IN. Also published in Journal of Parallel and Distributed Computing, 1993.

[bib347] [MRSR92] G. P. McKeown, V. J. Rayward-Smith, and S. A. Rush. Parallel Branch-and-Bound, 111–150. Advanced Topics in Computer Science. Blackwell Scientific Publications, Oxford, UK, <year>1992</year>.

[bib348] [MS88] Y. W. E. Ma and D. G. Shea. Downward scalability of parallel architectures. In Proceedings of the 1988 International Conference on Supercomputing, 109–120, 1988.

[bib349] [MS90] G. Manzini and M. Somalvico. Probabilistic performance analysis of heuristic search using parallel hash tables. In Proceedings of the International Symposium on Artificial Intelligence and Mathematics, 1990.

[bib350] [MS96] R. Miller and Q. F. Stout. Parallel Algorithms for Regular Architectures. MIT Press, Cambridge, MA, <year>1996</year>.

[bib351] [MV84] K. Mehlhorn and U. Vishkin. Randomized and deterministic simulations of PRAMs by parallel machines with restricted granularity of parallel memories. Acta Informatica, 21(4):339–374, November 1984.

[bib352] [MV85] B. Monien and O. Vornberger. The ring machine. Technical report, University of Paderborn, Germany, 1985. Also in Computers and Artificial Intelligence, 3(1987).

[bib353] [MV87] B. Monien and O. Vornberger. Parallel processing of combinatorial search trees. In Proceedings of International Workshop on Parallel Algorithms and Architectures, 1987.

[bib354] [MVS86] B. Monien, O. Vornberger, and E. Spekenmeyer. Superlinear speedup for parallel backtracking. Technical Report 30, University of Paderborn, Germany, 1986.

[bib355] [NA91] D. Nussbaum and A. Agarwal. Scalability of parallel machines. Communications of the ACM, 34(3):57–61, 1991.

[bib356] [Nat90] L. Natvig. Investigating the practical value of Cole’s O (log n) time crew pram merge sort algorithm. In 5th International Symposium on Computing and Information Sciences, October 1990.

[bib357] [NBF96] B. Nichols, B. Buttlar, and J. P. Farrell. Pthreads Programming. O’Reilly & Associates, Newton, MA 02164, <year>1996</year>.

[bib358] [nCU90] nCUBE Corporation. nCUBE 6400 Processor Manual. Beaverton, OR, <year>1990</year>.

[bib359] [ND96] S. J. Norton and M. D. DiPasquale. Thread time: the multithreaded programming guide. Hewlett-Packard professional books. Prentice-Hall, Englewood Cliffs, NJ 07632, <year>1996</year>.

[bib360] [Ni91] L. M. Ni. A layered classification of parallel computers. In Proceedings of 1991 International Conference for Young Computer Scientists, 28–33, 1991.

[bib361] [Nic90] J. R. Nickolls. The design of the MasPar MP-1: A cost-effective massively parallel computer. In IEEE Digest of Papers—Comcom, 25–28. IEEE Computer Society Press, Los Alamitos, CA, 1990.

[bib362] [NM93] L. M. Ni and McKinley. A survey of wormhole routing techniques in direct connect networks. IEEE Computer, 26(2), February 1993.

[bib363] [NMB83] D. Nath, S. N. Maheshwari, and P. C. P. Bhatt. Efficient VLSI networks for parallel processing based on orthogonal trees . IEEE Transactions on Computers, C–32:21–23, June 1983.

[bib364] [NS79] D. Nassimi and S. Sahni. Bitonic sort on a mesh connected parallel computer. IEEE Transactions on Computers, C–28(1), January 1979.

[bib365] [NS80] D. Nassimi and S. Sahni. Finding connected components and connected ones on a mesh-connected computer. SIAM Journal of Computing, 9(4):744–757, November 1980.

[bib366] [NS87] A. Norton and A. J. Silberger. Parallelization and performance analysis of the Cooley-Tukey FFT algorithm for shared memory architectures. IEEE Transactions on Computers, C-36(5):581–591, 1987.

[bib367] [NS93] M. Nigam and S. Sahni. Sorting n numbers on n × n reconfigurable meshes with buses. In 7th International Parallel Processing Symposium, 174–181, 1993.

[bib368] [NSF91] Grand Challenges: High Performance Computing and Communications. A Report by the Committee on Physical, Mathematical and Engineering Sciences, NSF/CISE, 1800 G Street NW, Washington, DC, 20550, <year>1991</year>.

[bib369] [Nug88] S. F. Nugent. The iPSC/2 direct-connect communications technology. In Proceedings of the Third Conference on Hypercubes, Concurrent Computers, and Applications, 51–60, 1988.

[bib370] [Nus82] H. J. Nussbaumer. Fast Fourier Transform and Convolution Algorithms. Springer-Verlag, New York, NY, <year>1982</year>.

[bib371] [NW88] D. M. Nicol and F. H. Willard. Problem size, parallel architecture, and optimal speedup. Journal of Parallel and Distributed Computing, 5:404–420, 1988.

[bib372] [GOV99] Funding a Revolution: Government Support for Computing Research. Committee on Innovations in Computing and Communications. National Academy Press, <year>1999</year>.

[bib373] [OR88] J. M. Ortega and C. H. Romine. The ijk forms of factorization methods II: Parallel systems. Parallel Computing, 7:149–162, 1988.

[bib374] [Ort88] J. M. Ortega. Introduction to Parallel and Vector Solution of Linear Systems. Plenum Press, New York, NY, <year>1988</year>.

[bib375] [OS85] D. P. O’Leary and G. W. Stewart. Data-flow algorithms for parallel matrix computations. Communications of the ACM, 28:840–853, 1985.

[bib376] [OS86] D. P. O’Leary and G. W. Stewart. Assignment and scheduling in parallel matrix factorization. Linear Algebra and its Applications, 77:275–299, 1986.

[bib377] [Pac98] P. Pacheco. Parallel Programming with MPI. Morgan Kaufmann, <year>1998</year>.

[bib378] [PBG⁺85] G. F. Pfister, W. C. Brantley, D. A. George, S. L. Harvey, W. J. Kleinfelder, K. P. McAuliffe, E. A. Melton, V. A. Norlton, and J. Weiss. The IBM research parallel processor prototype (RP3): Introduction and architecture. In Proceedings of 1985 International Conference on Parallel Processing, 764– 771, 1985.

[bib379] [PC89] P. M. Pardalos and J. Crouse. A parallel algorithm for the quadratic assignment problem. In Supercomputing ‘89 Proceedings, 351–360. ACM Press, New York, NY, 1989.

[bib380] [PD89] H. K. and L. W. Dowdy. Dynamic partitioning of multiprocessor systems. International Journal of Parallel Processing, 18(2):91–120, 1989.

[bib381] [Pea84] J. Pearl. Heuristics—Intelligent Search Strategies for Computer Problem Solving. Addison-Wesley, Reading, MA, <year>1984</year>.

[bib382] [Per87] R. Perrott. Parallel Programming. Addison-Wesley, Reading, MA, <year>1987</year>.

[bib383] [Pfi98] G. F. Pfister. In Search of Clusters. Prentice Hall, Englewood Cliffs, NJ, <year>1998</year>. 2nd Edition.

[bib384] [PFK90] C. Powley, C. Ferguson, and R. Korf. Parallel heuristic search: Two approaches. In V.Kumar, P. S. Gopalakrishnan, and L. N. Kanal, editors, Parallel Algorithms for Machine Intelligence and Vision. Springer-Verlag, New York, NY, <year>1990</year>.

[bib385] [PG98] T. Q. Pham and P. K. Garg. Multithreaded Programming with Win32. Prentice Hall, <year>1998</year>.

[bib386] [PH90] D. A. Patterson and J. L. Hennessy. Computer Architecture: A Quantitative Approach. Morgan Kaufmann, San Mateo, CA, <year>1990</year>.

[bib387] [PH96] D. A. Patterson and J. L. Hennessy. Computer Architecture: A Quantitative Approach, 2nd edition. Morgan Kaufmann, San Mateo, CA, <year>1996</year>.

[bib388] [PK89] R. C. Paige and C. P. Kruskal. Parallel algorithms for shortest path problems. In Proceedings of 1989 International Conference on Parallel Processing, 14– 19, 1989.

[bib389] [PK95] I. Pramanick and J. G. Kuhl. An inherently parallel method for heuristic problem-solving: Part I – general framework. IEEE Transactions on Parallel and Distributed Systems, 6(10), October 1995.

[bib390] [PKF92] C. Powley, R. Korf, and C. Ferguson. IDA* on the connection machine. Artificial Intelligence, 1992.

[bib391] [Pla89] C. C. Plaxton. Load balancing, selection and sorting on the hypercube. In Proceedings of the 1989 ACM Symposium on Parallel Algorithms and Architectures, 64–73, 1989.

[bib392] [PLRR94] P. M. Pardalos, Y. Li, K. Ramakrishna, and M. Resende. Lower bounds for the quadratic assignment problem. Annals of Operations Research, 50:387–411, <year>1994</year>. Special Volume on Applications of Combinatorial Optimization.

[bib393] [PR85] V. Pan and J. H. Reif. Efficient parallel solution of linear systems. In 17th Annual ACM Symposium on Theory of Computing, 143–152, 1985.

[bib394] [PR89] P. M. Pardalos and G. P. Rodgers. Parallel branch-and-bound algorithms for unconstrainted quadratic zero-one programming. In R.Sharda et al., editors, Impacts of Recent Computer Advances on Operations Research, 131–143. North-Holland, Amsterdam, The Netherlands, <year>1989</year>.

[bib395] [PR90] P. M. Pardalos and G. P. Rodgers. Parallel branch-and-bound algorithms for quadratic zero-one programming on a hypercube architecture. Annals of Operations Research, 22:271–292, <year>1990</year>.

[bib396] [PR91] M. Padberg and G. Rinaldi. A branch-and-cut algorithm for the resolution of large-scale symmetric traveling salesman problems. SIAM Review, 33:60– 100, 1991.

[bib397] [Pri57] R. C. Prim. Shortest connection network and some generalizations. Bell Systems Technical Journal, 36:1389–1401, 1957.

[bib398] [PRV88] G. Plateau, C. Roucairol, and I. Valabregue. Algorithm PR2 for the parallel size reduction of the 0/1 multiknapsack problem. In INRIA Rapports de Recherche, number 811, 1988.

[bib399] [PS82] C. H. Papadimitriou and K. Steiglitz. Combinatorial Optimization: Algorithms and Complexity. Prentice-Hall, Englewood Cliffs, NJ, <year>1982</year>.

[bib400] [PV80] F. P. Preparata and J. Vuillemin. Area-time optimal VLSI networks for matrix multiplication. In Proceedings of the 14th Princeton Conference on Information Science and Systems, 300–309, 1980.

[bib401] [PY88] C. H. Papadimitriou and M. Yannakakis. Towards an architecture independent analysis of parallel algorithms. In Proceedings of 20th ACM Symposium on Theory of Computing, 510–513, 1988.

[bib402] [QD86] M. J. Quinn and N. Deo. An upper bound for the speedup of parallel branch-and-bound algorithms. BIT, 26(1), March 1986.

[bib403] [Qui88] M. J. Quinn. Parallel sorting algorithms for tightly coupled multiprocessors. Parallel Computing, 6:349–357, 1988.

[bib404] [Qui89] M. J. Quinn. Analysis and implementation of branch-and-bound algorithms on a hypercube multicomputer. IEEE Transactions on Computers, 1989.

[bib405] [Qui94] M. J. Quinn. Parallel Computing: Theory and Practice. McGraw-Hill, New York, NY, <year>1994</year>.

[bib406] [Ram97] V. Ramachandran. Qsm: A general purpose shared-memory model for parallel computation. In Foundations of Software Technology and Theoretical Computer Science, 1–5, <year>1997</year>.

[bib407] [Ran89] A. G. Ranade. Fluent Parallel Computation. Ph.D. Thesis, Department of Computer Science, Yale University, New Haven, CT, <year>1989</year>.

[bib408] [Ran91] A. G. Ranade. Optimal speedup for backtrack search on a butterfly network. In Proceedings of the Third ACM Symposium on Parallel Algorithms and Architectures, 1991.

[bib409] [Rao90] V. N. Rao. Parallel Processing of Heuristic Search. Ph.D. Thesis, University of Texas, Austin, TX, <year>1990</year>.

[bib410] [Ras78] L. Raskin. Performance Evaluation of Multiple Processor Systems. Ph.D. Thesis, Carnegie-Mellon University, Pittsburgh, PA, <year>1978</year>.

[bib411] [RB76] C. M. Rader and N. M. Brenner. A new principle for Fast fourier transform. IEEE Transactions on Acoustics, Speech and Signal Processing, 24:264–265, 1976.

[bib412] [RDK89] C. Renolet, M. Diamond, and J. Kimbel. Analytical and heuristic modeling of distributed algorithms. Technical Report E3646, FMC Corporation, Advanced Systems Center, Minneapolis, MN, 1989.

[bib413] [Rei81] R. Reischuk. Probabilistic algorithms for sorting and selection. SIAM Journal of Computing, 396–409, 1981.

[bib414] [RF89] D. A. Reed and R. M. Fujimoto. Multicomputer Networks: Message-Based Parallel Processing. MIT Press, Cambridge, MA, <year>1989</year>.

[bib415] [RICN88] K. Rokusawa, N. Ichiyoshi, T. Chikayama, and H. Nakashima. An efficient termination detection and abortion algorithm for distributed processing systems. In Proceedings of 1988 International Conference on Parallel Processing: Vol. I, 18–22, 1988.

[bib416] [RK87] V. N. Rao and V. Kumar. Parallel depth-first search, part I: Implementation. International Journal of Parallel Programming, 16(6):479–499, 1987.

[bib417] [RK88a] V. N. Rao and V. Kumar. Concurrent access of priority queues. IEEE Transactions on Computers, C–37 (12), 1988.

[bib418] [RK88b] V. N. Rao and V. Kumar. Superlinear speedup in state-space search. In Proceedings of the 1988 Foundation of Software Technology and Theoretical Computer Science, number 338, 161–174. Springer-Verlag Series Lecture Notes in Computer Science, 1988.

[bib419] [RK93] V. N. Rao and V. Kumar. On the efficicency of parallel backtracking. IEEE Transactions on Parallel and Distributed Systems, 4(4):427–437, April 1993. available as a technical report TR 90-55, Computer Science Department, University of Minnesota.

[bib420] [RKR87] V. N. Rao, V. Kumar, and K. Ramesh. A parallel implementation of iterative-deepening-A*. In Proceedings of the National Conference on Artificial Intelligence (AAAI-87), 878–882, 1987.

[bib421] [RND77] E. M. Reingold, J. Nievergelt, and N. Deo. Combinatorial Algorithms: Theory and Practice. Prentice-Hall, Englewood Cliffs, NJ, <year>1977</year>.

[bib422] [RO88] C. H. Romine and J. M. Ortega. Parallel solution of triangular systems of equations. Parallel Computing, 6:109–114, 1988.

[bib423] [Rob75] M. O. Robin. Probabilistic algorithms. In J.Traub, editor, Algorithms and Complexity: New Directions and Recent Results, 21–39. Academic Press, San Diego, CA, <year>1975</year>.

[bib424] [Rob90] Y. Robert. The Impact of Vector and Parallel Architectures on Gaussian Elimination. John Wiley and Sons, New York, NY, <year>1990</year>.

[bib425] [Rom87] C. H. Romine. The parallel solution of triangular systems on a hypercube. In M. T. Heath, editor, Hypercube Multiprocessors 1987, 552–559. SIAM, Philadelphia, PA, <year>1987</year>.

[bib426] [Rou87] C. Roucairol. A parallel branch-and-bound algorithm for the quadratic assignment problem. Discrete Applied Mathematics, 18:211–225, 1987.

[bib427] [Rou91] C. Roucairol. Parallel branch-and-bound on shared-memory multiprocessors. In Proceedings of the Workshop On Parallel Computing of Discrete Optimization Problems, 1991.

[bib428] [RRRR96] K. A. Robbins, S. Robbins, K. R. Robbins, and S. Robbins. Practical UNIX Programming: A Guide to Concurrency, Communication, and Multithreading. Prentice Hall, <year>1996</year>.

[bib429] [RS90a] A. P. R. Alverson, D. Callahan, D. Cummings, B. Koblenz and B. Smith. The tera computer system. In International Conference on Supercomputing, 1–6, 1990.

[bib430] [RS90b] S. Ranka and S. Sahni. Hypercube Algorithms for Image Processing and Pattern Recognition. Springer-Verlag, New York, NY, <year>1990</year>.

[bib431] [RV87] J. H. Reif and L. G. Valiant. A logarithmic time sort for linear size networks. Journal of the ACM, 34(1):60–76, January 1987.

[bib432] [Ryt88] W. Rytter. Efficient parallel computations for dynamic programming. Theoretical Computer Science, 59:297–307, 1988.

[bib433] [RZ89] M.Reeve and S.E.Zenith, editors. Parallel Processing and Artificial Intelligence. Wiley, Chichester, UK, <year>1989</year>.

[bib434] [Saa86] Y. Saad. Communication complexity of the Gaussian elimination algorithm on multiprocessors. Linear Algebra and its Applications, 77:315–340, 1986.

[bib435] [SB77] H. Sullivan and T. R. Bashkow. A large scale, homogeneous, fully distributed parallel machine. In Proceedings of Fourth Symposium on Computer Architecture, 105–124, March 1977.

[bib436] [SBCV90] R. H. Saavedra-Barrera, D. E. Culler, and T. Von Eiken. Analysis of multithreaded architectures for parallel computing. Report UCB/CSD 90/569, University of California, Berkeley, Computer Science Division, Berkeley, CA, April 1990.

[bib437] [Sch80] J. T. Schwartz. Ultracomputers. ACM Transactions on Programming Languages and Systems, 2:484–521, October 1980.

[bib438] [Sed78] R. Sedgewick. Implementing quicksort programs. Communications of the ACM, 21(10):847–857, 1978.

[bib439] [Sei85] C. L. Seitz. The cosmic cube. Communications of the ACM, 28(1):22–33, 1985.

[bib440] [Sei89] S. R. Seidel. Circuit-switched vs. store-and-forward solutions to symmetric communication problems. In Proceedings of the Fourth Conference on Hypercubes, Concurrent Computers, and Applications, 253–255, 1989.

[bib441] [Sei92] C. L. Seitz. Mosaic C: An experimental fine-grain multicomputer. Technical report, California Institute of Technology, Pasadena, CA, 1992.

[bib442] [SG88] S. R. Seidel and W. L. George. Binsorting on hypercube with d-port communication. In Proceedings of the Third Conference on Hypercube Concurrent Computers, 1455–1461, January 1988.

[bib443] [SG91] X. -H. Sun and J. L. Gustafson. Toward a better parallel performance metric. Parallel Computing, 17:1093–1109, <year>December 1991</year>. Also available as Technical Report IS-5053, UC-32, Ames Laboratory, Iowa State University, Ames, IA.

[bib444] [Sha85] J. A. Sharp. Data-Flow Computing. Ellis Horwood, Chichester, UK, <year>1985</year>.

[bib445] [She59] D. L. Shell. A high-speed sorting procedure. Communications of the ACM, 2(7):30–32, July 1959.

[bib446] [SHG93] J. P. Singh, J. L. Hennessy, and A. Gupta. Scaling parallel programs for multiprocessors: Methodology and examples. IEEE Computer, 26(7):42–50, 1993.

[bib447] [Sie77] H. J. Siegel. The universality of various types of SIMD machine interconnection networks. In Proceedings of the 4th Annual Symposium on Computer Architecture , 23–25, 1977.

[bib448] [Sie85] H. J. Siegel. Interconnection Networks for Large-Scale Parallel Processing. D. C. Heath, Lexington, MA, <year>1985</year>.

[bib449] [SJ81] C. Savage and J. Jaja. Fast, efficient parallel algorithms for some graph problems. SIAM Journal of Computing, 10(4):682–690, November 1981.

[bib450] [SK89] W. Shu and L. V. Kale. A dynamic scheduling strategy for the chare-kernel system. In Proceedings of Supercomputing Conference, 389–398, 1989.

[bib451] [SK90] V. Saletore and L. V. Kale. Consistent linear speedup to a first solution in parallel state-space search. In Proceedings of the 1990 National Conference on Artificial Intelligence, 227–233, August 1990.

[bib452] [SKAT91a] V. Singh, V. Kumar, G. Agha, and C. Tomlinson. Efficient algorithms for parallel sorting on mesh multicomputers. International Journal of Parallel Programming, 20(2):95–131, 1991.

[bib453] [SKAT91b] V. Singh, V. Kumar, G. Agha, and C. Tomlinson. Scalability of parallel sorting on mesh multicomputers. International Journal of Parallel Programming, 20(2), 1991.

[bib454] [SM86] S. J. Stolfo and D. P. Miranker. The DADO production system machine. Journal of Parallel and Distributed Computing, 3:269–296, June 1986.

[bib455] [Smi84] D. R. Smith. Random trees and the analysis of branch and bound procedures. Journal of the ACM, 31(1), 1984.

[bib456] [SN90] X. -H. Sun and L. M. Ni. Another view of parallel speedup. In Supercomputing ’90 Proceedings, 324–333, 1990.

[bib457] [SN93] X. -H. Sun and L. M. Ni. Scalable problems and memory-bounded speedup. Journal of Parallel and Distributed Computing, 19:27–37, September 1993.

[bib458] [Sni82] M. Snir. On parallel search. In Proceedings of Principles of Distributed Computing, 242–253, 1982.

[bib459] [Sni85] M. Snir. On parallel searching. SIAM Journal of Computing, 14(3):688–708, August 1985.

[bib460] [Sny86] L. Snyder. Type architectures, shared-memory and the corollary of modest potential. Annual Review of Computer Science, 1:289–317, <year>1986</year>.

[bib461] [SOHL⁺96] M. Snir, S. W. Otto, S. Huss-Lederman, D. W. Walker, and J. Dongarra. MPI: The Complete Reference. MIT Press, Cambridge, MA, <year>1996</year>.

[bib462] [Sol77] M. Sollin. An algorithm attributed to Sollin. In S.Goodman and S.Hedetniemi, editors, Introduction to The Design and Analysis of Algorithms. McGraw-Hill, Cambridge, MA, <year>1977</year>.

[bib463] [SR91] X. -H. Sun and D. T. Rover. Scalability of parallel algorithm-machine combinations. Technical Report IS-5057, Ames Laboratory, Iowa State University, Ames, IA, 1991. Also published in IEEE Transactions on Parallel and Distributed Systems.

[bib464] [SS88] Y. Saad and M. H. Schultz. Topological properties of hypercubes. IEEE Transactions on Computers, 37:867–872, 1988.

[bib465] [SS89a] Y. Saad and M. H. Schultz. Data communication in hypercubes. Journal of Parallel and Distributed Computing, 6:115–135, 1989. Also available as Technical Report YALEU/DCS/RR-428 from the Department of Computer Science, Yale University, New Haven, CT.

[bib466] [SS89b] Y. Saad and M. H. Schultz. Data communication in parallel architectures. Parallel Computing, 11:131–150, 1989.

[bib467] [SS90] H. Shi and J. Schaeffer. Parallel sorting by regular sampling. Journal of Parallel and Distributed Computing, 14:361–372, 1990.

[bib468] [ST95] B. M. S. Tschvke, R. Lling. Solving the traveling salesman problem with a distributed branch-and-bound algorithm on a 1024 processor network. In Proceedings of the 9th International Parallel Processing Symposium, 182– 189, Santa Barbara, CA, April 1995.

[bib469] [Sto71] H. S. Stone. Parallel processing with the perfect shuffle. IEEE Transactions on Computers, C-20(2):153–161, 1971.

[bib470] [Sto93] H. S. Stone. High-Performance Computer Architecture: Third Edition. Addison-Wesley, Reading, MA, <year>1993</year>.

[bib471] [Sun95] SunSoft. Solaris multithreaded programming guide. SunSoft Press, Mountainview, CA, <year>1995</year>.

[bib472] [Sup91] Supercomputer Systems Division, Intel Corporation. Paragon XP/S Product Overview. Beaverton, OR, <year>1991</year>.

[bib473] [SV81] Y. Shiloach and U. Vishkin. Finding the maximum, merging and sorting in a parallel computation model. Journal of Algorithms, 88–102, 1981.

[bib474] [SV82] Y. Shiloach and U. Vishkin. An O (n² log n) parallel max-flow algorithm. Journal of Algorithms, 3:128–146, 1982.

[bib475] [SW87] Q. F. Stout and B. A. Wagar. Passing messages in link-bound hypercubes. In M.T.Heath, editor, Hypercube Multiprocessors 1987, 251–257. SIAM, Philadelphia, PA, <year>1987</year>.

[bib476] [Swa87] P. N. Swarztrauber. Multiprocessor FFTs. Parallel Computing, 5:197–210, 1987.

[bib477] [SZ96] X. -H. Sun and J. P. Zhu. Performance considerations of shared virtual memory machines. IEEE Trans. on Parallel and Distributed Systems, 6(11):1185– 1194, 1996.

[bib478] [Tab90] D. Tabak. Multiprocessors. Prentice-Hall, Englewood Cliffs, NJ, <year>1990</year>.

[bib479] [Tab91] D. Tabak. Advanced Multiprocessors. McGraw-Hill, New York, NY, <year>1991</year>.

[bib480] [Tak87] R. Take. An optimal routing method of all-to-all communication on hypercube networks. In 35th Information Processing Society of Japan, <year>1987</year>.

[bib481] [TEL95] D. M. Tullsen, S. Eggers, and H. M. Levy. Simultaneous multithreading: Maximizing on-chip parallelism. In Proceedings of the 22nd Annual International Symposium on Computer Architecture, 1995.

[bib482] [Ten90] S. Teng. Adaptive parallel algorithms for integral knapsack problems. Journal of Parallel and Distributed Computing, 8:400–406, 1990.

[bib483] [Thi90] Thinking Machines Corporation. The CM-2 Technical Summary. Cambridge, MA. <year>1990</year>.

[bib484] [Thi91] Thinking Machines Corporation. The CM-5 Technical Summary. Cambridge, MA. <year>1991</year>.

[bib485] [Tho83] C. D. Thompson. Fourier transforms in VLSI. IBM Journal of Research and Development, C-32(11):1047–1057, 1983.

[bib486] [Thr99] J. Throop. Standards: OpenMP: Shared-memory parallelism from the ashes. Computer, 32(5):108–109, May 1999.

[bib487] [Tic88] W. F. Tichy. Parallel matrix multiplication on the connection machine. Technical Report RIACS TR 88.41, Research Institute for Advanced Computer Science, NASA Ames Research Center, Moffet Field, CA, 1988.

[bib488] [TK77] C. D. Thompson and H. T. Kung. Sorting on a mesh-connected parallel computer. Communications of the ACM, 21:263–271, 1977.

[bib489] [TL90] Z. Tang and G. -J. Li. Optimal granularity of grid iteration problems. In Proceedings of the 1990 International Conference on Parallel Processing, I111–I118, 1990.

[bib490] [Tul96] D. M. Tullsen. Simultaneous multithreading. Ph.D. Thesis, University of Washington, Seattle, WA, <year>1996</year>.

[bib491] [TV85] R. E. Tarjan and U. Vishkin. An efficient parallel biconnectivity algorithm. SIAM Journal on Computing, 14(4):862–874, November 1985.

[bib492] [TY96] J. -Y. Tsai and P. -C. Yew. The superthreaded architecture: Thread pipelining with run-time data dependence checking and control speculation. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 35–46, 1996.

[bib493] [Upf84] E. Upfal. A probabilistic relation between desirable and feasible models of parallel computation. In Proceedings of the 16th ACM Conference on Theory of Computing, 258–265, <year>1984</year>.

[bib494] [UW84] E. Upfal and A. Widgerson. How to share memory in a distributed system. In Proceedings of the 25th Annual Symposium on the Foundation of Computer Science, 171–180, 1984.

[bib495] [Val75] L. G. Valiant. Parallelism in comparison problems. SIAM Journal of Computing, 4(3):348–355, September 1975.

[bib496] [Val82] L. G. Valiant. A scheme for fast parallel communication. SIAM Journal on Computing, 11:350–361, 1982.

[bib497] [Val90a] L. G. Valiant. A bridging model for parallel computation. Communications of the ACM, 33(8), 1990.

[bib498] [Val90b] L. G. Valiant. General purpose parallel architectures. Handbook of Theoretical Computer Science, <year>1990</year>.

[bib499] [Vav89] S. A. Vavasis. Gaussian elimination with pivoting is P-complete. SIAM Journal on Discrete Mathematics, 2:413–423, 1989.

[bib500] [VB81] L. G. Valiant and G. J. Brebner. Universal schemes for parallel communication. In Proceedings of the 13th ACM Symposium on Theory of Computation, 263–277, 1981.

[bib501] [VC89] F. A. Van-Catledge. Towards a general model for evaluating the relative performance of computer systems. International Journal of Supercomputer Applications, 3(2):100–108, 1989.

[bib502] [Vor86] O. Vornberger. Implementing branch-and-bound in a ring of processors. Technical Report 29, University of Paderborn, Germany, 1986.

[bib503] [Vor87a] O. Vornberger. The personal supercomputer: A network of transputers. In Proceedings of the 1987 International Conference on Supercomputing, 1987.

[bib504] [Vor87b] O. Vornberger. Load balancing in a network of transputers. In Proceedings of the Second International Workshop on Distributed Parallel Algorithms, 1987.

[bib505] [VS86] J. S. Vitter and R. A. Simmons. New classes for parallel complexity: A study of unification and other complete problems for P. IEEE Transactions on Computers, May 1986.

[bib506] [VSBR83] L. G. Valiant, S. Skyum, S. Berkowitz, and C. Rackoff. Fast parallel computation of polynomials using few processors. SIAM Journal of Computing, 12(4):641–644, 1983.

[bib507] [WA98] B. Wilkinson and C. M. Allen. Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers. Prentice Hall, <year>1998</year>.

[bib508] [WA99] B. Wilkinson and M. Allen. Parallel Programming. Prentice-Hall, NJ, <year>1999</year>.

[bib509] [Wag87] B. A. Wagar. Hyperquicksort: A fast sorting algorithm for hypercubes. In Proceedings of the Second Conference on Hypercube Multiprocessors, 292– 299, 1987.

[bib510] [Wal91] Y. Wallach. Parallel Processing and Ada. Prentice-Hall, Englewood Cliffs, NJ, <year>1991</year>.

[bib511] [WB72] W. A. Wulf and C. G. Bell. C.mmp—a multimicroprocessor. In Proceedings of AFIPS Conference, 765–777, 1972.

[bib512] [Wei97] B. Weissman. Active threads manual. Technical Report TR-97-037, International Computer Science Institute, Berkeley, CA 94704, <year>1997</year>.

[bib513] [WF80] C. L. Wu and T. Y. Feng. On a class of multistage interconnection networks. IEEE Transactions on Computers, 669–702, August 1980.

[bib514] [WF84] C. L. Wu and T. Y. Feng. Interconnection Networks for Parallel and Distributed Processing. IEEE Computer Society Press, Washington, DC, <year>1984</year>.

[bib515] [WI89] K. Wada and N. Ichiyoshi. A distributed shortest path algorithm and its mapping on the Multi-PSI. In Proceedings of International Conference of Parallel Processing, 1989.

[bib516] [Wil86] H. S. Wilf. Algorithms and Complexity. Prentice-Hall, NJ, <year>1986</year>.

[bib517] [Wil95] G. V. Wilson. Practical Parallel Programming. MIT Press, Cambridge, MA, <year>1995</year>.

[bib518] [Wil00] A. Williams. Windows 2000 Systems Programming Black Book. The Coriolis Group, <year>2000</year>.

[bib519] [Win77] S. Winograd. A new method for computing DFT. In IEEE International Conference on Acoustics, Speech and Signal Processing, 366–368, 1977.

[bib520] [WL88] B. W. Wah and G. -J. Li. Systolic processing for dynamic programming problems. Circuits, Systems, and Signal Processing, 7(2):119–149, 1988.

[bib521] [WLY84] B. W. Wah, G. -J. Li, and C. F. Yu. The status of MANIP—a multicomputer architecture for solving combinatorial extremum-search problems. In Proceedings of 11th Annual International Symposium on Computer Architecture, 56–63, 1984.

[bib522] [WM84] B. W. Wah and Y. W. E. Ma. MANIP—a multicomputer architecture for solving combinatorial extremum-search problems. IEEE Transactions on Computers, C–33, May 1984.

[bib523] [Woo86] J.V.Woods, editor. Fifth Generation Computer Architectures. North-Holland, Amsterdam, The Netherlands, <year>1986</year>.

[bib524] [Wor88] P. H. Worley. Information Requirements and the Implications for Parallel Computation. Ph.D. Thesis, Stanford University, Department of Computer Science, Palo Alto, CA, <year>1988</year>.

[bib525] [Wor90] P. H. Worley. The effect of time constraints on scaled speedup. SIAM Journal on Scientific and Statistical Computing, 11(5):838–858, 1990.

[bib526] [Wor91] P. H. Worley. Limits on parallelism in the numerical solution of linear PDEs. SIAM Journal on Scientific and Statistical Computing, 12:1–35, January 1991.

[bib527] [WS88] Y. Won and S. Sahni. A balanced bin sort for hypercube multiprocessors. Journal of Supercomputing, 2:435–448, 1988.

[bib528] [WS89] J. Woo and S. Sahni. Hypercube computing: Connected components. Journal of Supercomputing, 3:209–234, 1989.

[bib529] [WS91] J. Woo and S. Sahni. Computing biconnected components on a hypercube. Journal of Supercomputing, June 1991. Also available as Technical Report TR 89-7 from the Department of Computer Science, University of Minnesota, Minneapolis, MN.

[bib530] [WY85] B. W. Wah and C. F. Yu. Stochastic modeling of branch-and-bound algorithms with best-first search. IEEE Transactions on Software Engineering, SE-11, September 1985.

[bib531] [Zho89] X. Zhou. Bridging the gap between Amdahl’s law and Sandia laboratory’s result. Communications of the ACM, 32(8):1014–5, 1989.

[bib532] [Zom96] A.Zomaya, editor. Parallel and Distributed Computing Handbook. McGraw-Hill, <year>1996</year>.

[bib533] [ZRV89] J. R. Zorbas, D. J. Reble, and R. E. VanKooten. Measuring the scalability of parallel computer systems. In Supercomputing ’89 Proceedings, 832–841, <year>1989</year>.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Bibliography

Create new playlist

Sign In

Sign Up

Bibliography

Table of Contents for
Bibliography