000 03324nam a22005775i 4500
001 978-3-031-19067-4
003 DE-He213
005 20240730163609.0
007 cr nn 008mamaa
008 221125s2023 sz | s |||| 0|eng d
020 _a9783031190674
_9978-3-031-19067-4
024 7 _a10.1007/978-3-031-19067-4
_2doi
050 4 _aQA76.9.A43
072 7 _aUMB
_2bicssc
072 7 _aCOM051300
_2bisacsh
072 7 _aUMB
_2thema
082 0 4 _a518.1
_223
100 1 _aJoshi, Gauri.
_eauthor.
_4aut
_4http://id.loc.gov/vocabulary/relators/aut
_979416
245 1 0 _aOptimization Algorithms for Distributed Machine Learning
_h[electronic resource] /
_cby Gauri Joshi.
250 _a1st ed. 2023.
264 1 _aCham :
_bSpringer International Publishing :
_bImprint: Springer,
_c2023.
300 _aXIII, 127 p. 40 illus., 38 illus. in color.
_bonline resource.
336 _atext
_btxt
_2rdacontent
337 _acomputer
_bc
_2rdamedia
338 _aonline resource
_bcr
_2rdacarrier
347 _atext file
_bPDF
_2rda
490 1 _aSynthesis Lectures on Learning, Networks, and Algorithms,
_x2690-4314
505 0 _aDistributed Optimization in Machine Learning -- Calculus, Probability and Order Statistics Review -- Convergence of SGD and Variance-Reduced Variants -- Synchronous SGD and Straggler-Resilient Variants -- Asynchronous SGD and Staleness-Reduced Variants -- Local-update and Overlap SGD -- Quantized and Sparsified Distributed SGD -- Decentralized SGD and its Variants.
520 _aThis book discusses state-of-the-art stochastic optimization algorithms for distributed machine learning and analyzes their convergence speed. The book first introduces stochastic gradient descent (SGD) and its distributed version, synchronous SGD, where the task of computing gradients is divided across several worker nodes. The author discusses several algorithms that improve the scalability and communication efficiency of synchronous SGD, such as asynchronous SGD, local-update SGD, quantized and sparsified SGD, and decentralized SGD. For each of these algorithms, the book analyzes its error versus iterations convergence, and the runtime spent per iteration. The author shows that each of these strategies to reduce communication or synchronization delays encounters a fundamental trade-off between error and runtime.
650 0 _aAlgorithms.
_93390
650 0 _aMachine learning.
_91831
650 0 _aArtificial intelligence.
_93407
650 0 _aDistribution (Probability theory).
_910767
650 0 _aComputer science.
_99832
650 1 4 _aAlgorithms.
_93390
650 2 4 _aMachine Learning.
_91831
650 2 4 _aDesign and Analysis of Algorithms.
_931835
650 2 4 _aArtificial Intelligence.
_93407
650 2 4 _aDistribution Theory.
_979417
650 2 4 _aComputer Science.
_99832
710 2 _aSpringerLink (Online service)
_979418
773 0 _tSpringer Nature eBook
776 0 8 _iPrinted edition:
_z9783031190667
776 0 8 _iPrinted edition:
_z9783031190681
776 0 8 _iPrinted edition:
_z9783031190698
830 0 _aSynthesis Lectures on Learning, Networks, and Algorithms,
_x2690-4314
_979419
856 4 0 _uhttps://doi.org/10.1007/978-3-031-19067-4
912 _aZDB-2-SXSC
942 _cEBK
999 _c84778
_d84778