Add new posts about eddl

2026-06-13 08:08:10 -07:00 · 2021-10-13 18:34:16 -04:00
parent 1595ec54ad
commit c2f2b702c5
16 changed files with 621 additions and 40 deletions
@@ -0,0 +1,77 @@
+---
+layout: post
+title:  "EDDL: How do we train neural networks on limited edge devices - PART 1"
+date:   2021-10-13 16:53:20 -0400
+categories: Research
+---
+This post introduces our previous milestone in project "Edge trainer", as the paper "EDDL: A Distributed Deep Learning System for Resource-limited Edge Computing Environment." was published.
+As the first part of this introduction, I focus only on the motivation and a summary of our work.
+More details on the design and implementation can be found in later posts.
+
+<img src="/static/2021-10/edgelearn-1.png" height="250">
+
+## Why do we need training on edge?
+
+The cloud is not trustworthy anymore. More and more evidence shows that cloud breaches happen more frequently than before.
+Nowadays, as more and more sensitive personal data is uploaded to cloud data centers, tech companies know users better than the users know themselves.
+  
+Researchers in both industry and academia are looking for ways to keep learning from users' data while leaving the raw sensitive data under users' control.
+Many publications have already shown the feasibility of sharing only the trained model instead of the raw data.
+One popular recent study on this is Google's [federated learning](https://ai.googleblog.com/2017/04/federated-learning-collaborative.html).
+  
+While investigating this problem, we found that letting end users train on their own data is safe but sacrifices efficiency.
+Since a single end device has limited resources, training time and power consumption can be disappointing.
+We believe there must be a balance between privacy and efficiency in some target scenarios.
+
+Fortunately, we observed that users who belong to the same campus, plant, firm, or community tend to share similar interests.
+Therefore, these co-located users have similar demands for AI-involved routines.
+Co-located users are also easily targeted by the same types of threats, such as ransomware aimed at financial practitioners.
+
+Consider this scenario: features of a new malware app are sent to cloud services to train a neural network used by an antivirus program.
+This process may take a long time, and a small number of samples may not be recognized by the global neural network model.
+A customized local model trained and deployed on the edge can counter this problem.
+Using edge training as a supplement to cloud training can achieve better response times and make the whole system more flexible.
+
+## Why training on edge is hard?
+
+Since all co-located users' devices can be used for edge training, several issues and challenges arise when deploying such a distributed system.
+
+The first challenge is **struggling workers**.
+Training devices are heterogeneous, ranging from resource-limited IoT cameras to high-end media centers with powerful GPUs.
+They are not designed for machine learning.
+So, a good edge-based distributed learning framework must be able to handle a variety of training speeds.
+
+The second challenge is how to **scale up** clusters.
+On a campus, thousands or more devices may contribute computing resources to the same training task.
+However, these devices may be located far apart, either physically or in the network topology.
+How to use them well without struggling with endless transmission time remains a challenge.
+
+The third issue is the frequent **joining and exiting** of devices.
+We can't rely on each device to faithfully work on training tasks rather than its original workload.
+Smartly balancing the workload and handling join/exit events also need to be taken into consideration.
+
+## Our proposal
+
+- Dynamic training data distribution and runtime profiler
+
+    We design a dynamic training data distribution mechanism that helps with both the first and the third challenges.
+    Preprocessed data can be transmitted without leaking raw sensitive information.
+    This helps struggling workers, which can train on smaller batches so that they upload parameters after a similar training time.
+    Also, for extremely slow devices and for devices that join or exit, dynamic data distribution and the profiler help keep the global training parameters from being polluted or becoming stale.
+
+    To counter heterogeneity, more approaches were applied in our later research.
+    More details were added to the runtime profiler in later work.
+
+- Asynchronous and synchronous aggregation enabled
+
+    In our findings, asynchronous and synchronous parameter updates each have their pros and cons.
+    Staying synchronous all the time leaves the struggling worker issue unsolvable.
+    However, the harm that asynchronous updates do to accuracy and convergence time also needs attention.
+    We propose carefully choosing between these two update policies at runtime to make use of the advantages of each.
+
+- Leader role splitting
+
+    The idea is to let worker devices with higher bandwidth take the leader role during training.
+    Parameter updating does not require much computation; it mainly needs bandwidth.
+    Devices with sufficient bandwidth can also work as virtual leader devices.
+    This approach helps minimize the number of physical devices we use, and more leaders can further raise the limit on the number of workers.
@@ -60,6 +60,8 @@
      
        
      
+        
+      
    </nav>
  </div>
 </header>
@@ -104,6 +106,8 @@
  <div class="col-box-title">Newest Posts</div>
  <ul class="post-list">
    
+      <li><a class="post-link" href="/archivers/eddl-how-do-we-train-on-limited-edge-devices">EDDL: How do we train neural networks on limited edge devices - PART 1</a></li>
+    
      <li><a class="post-link" href="/archivers/generate-word-cloud-with-chinese-fenci">Generate Word Cloud Figures with Chinese-Tokenization and WordCloud python libraries</a></li>
    
      <li><a class="post-link" href="/archivers/intro-xv6">Xv6 introduction</a></li>
@@ -112,8 +116,6 @@
    
      <li><a class="post-link" href="/archivers/charles-is-not-a-good-tool">Using charles proxy to monitor mobile SSL traffics</a></li>
    
-      <li><a class="post-link" href="/archivers/hello">Stop Talking is the worst title of one blog</a></li>
-    
  </ul>
 </div>

@@ -60,6 +60,8 @@
      
        
      
+        
+      
    </nav>
  </div>
 </header>
@@ -216,6 +218,8 @@ Niagara Falls, NY, USA, 2017.</p>
  <div class="col-box-title">Newest Posts</div>
  <ul class="post-list">
    
+      <li><a class="post-link" href="/archivers/eddl-how-do-we-train-on-limited-edge-devices">EDDL: How do we train neural networks on limited edge devices - PART 1</a></li>
+    
      <li><a class="post-link" href="/archivers/generate-word-cloud-with-chinese-fenci">Generate Word Cloud Figures with Chinese-Tokenization and WordCloud python libraries</a></li>
    
      <li><a class="post-link" href="/archivers/intro-xv6">Xv6 introduction</a></li>
@@ -224,8 +228,6 @@ Niagara Falls, NY, USA, 2017.</p>
    
      <li><a class="post-link" href="/archivers/charles-is-not-a-good-tool">Using charles proxy to monitor mobile SSL traffics</a></li>
    
-      <li><a class="post-link" href="/archivers/hello">Stop Talking is the worst title of one blog</a></li>
-    
  </ul>
 </div>

@@ -59,6 +59,8 @@
      
        
      
+        
+      
    </nav>
  </div>
 </header>
@@ -146,6 +148,8 @@ You also need to save charles Root Certificate, it also contains in the same men
  <div class="col-box-title">Newest Posts</div>
  <ul class="post-list">
    
+      <li><a class="post-link" href="/archivers/eddl-how-do-we-train-on-limited-edge-devices">EDDL: How do we train neural networks on limited edge devices - PART 1</a></li>
+    
      <li><a class="post-link" href="/archivers/generate-word-cloud-with-chinese-fenci">Generate Word Cloud Figures with Chinese-Tokenization and WordCloud python libraries</a></li>
    
      <li><a class="post-link" href="/archivers/intro-xv6">Xv6 introduction</a></li>
@@ -154,8 +158,6 @@ You also need to save charles Root Certificate, it also contains in the same men
    
      <li><a class="post-link" href="/archivers/charles-is-not-a-good-tool">Using charles proxy to monitor mobile SSL traffics</a></li>
    
-      <li><a class="post-link" href="/archivers/hello">Stop Talking is the worst title of one blog</a></li>
-    
  </ul>
 </div>

@@ -0,0 +1,229 @@
+<!DOCTYPE html>
+<html>
+
+  <head>
+  <meta charset="utf-8">
+  <meta http-equiv="X-UA-Compatible" content="IE=edge">
+  <meta name="viewport" content="width=device-width, initial-scale=1">
+
+  <title>EDDL: How do we train neural networks on limited edge devices - PART 1 « Stop Talking, Start Doing</title>
+  <meta name="description" content="This post introduces our previous milestone in project “Edge trainer”, as the paper “EDDL: A Distributed Deep Learning System for Resource-limited Edge Compu...">
+
+  <link rel="stylesheet" href="/css/main.css">
+  <link rel="stylesheet" href="/css/timeline.css">
+  <link rel="canonical" href="https://codersherlock.github.com//archivers/eddl-how-do-we-train-on-limited-edge-devices">
+  <link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Tangerine">
+  <link rel="alternate" type="application/rss+xml" title="Stop Talking, Start Doing" href="https://codersherlock.github.com//feed.xml" />
+  <script>
+  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+        (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
+    m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+          })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
+
+  ga('create', 'UA-82637164-1', 'auto');
+    ga('send', 'pageview');
+
+  </script>
+  <script async src="//pagead2.googlesyndication.com/pagead/js/adsbygoogle.js"></script>
+  <script>
+    (adsbygoogle = window.adsbygoogle || []).push({
+      google_ad_client: "ca-pub-6651321038908478",
+      enable_page_level_ads: true
+    });
+  </script>
+</head>
+
+
+  <body>
+
+    <header class="header">
+  <div class="wrapper">
+    <a class="site-title" href="/">Stop Talking, Start Doing</a>
+    <nav class="site-nav">
+      
+        
+      
+        
+        <a class="page-link" href="/about/">About</a>
+        
+      
+        
+        <a class="page-link" href="/category/">Category</a>
+        
+      
+        
+      
+        
+      
+        
+      
+        
+      
+        
+      
+    </nav>
+  </div>
+</header>
+
+    <div class="page-content">
+      <div class="wrapper">
+        <div class="col-main">
+          <div class="post">
+
+  <header class="post-header">
+    <h1 class="post-title">EDDL: How do we train neural networks on limited edge devices - PART 1</h1>
+    <p class="post-meta">Oct 13, 2021</p>
+  </header>
+
+  <article class="post-content">
+    <p>This post introduces our previous milestone in project “Edge trainer”, as the paper “EDDL: A Distributed Deep Learning System for Resource-limited Edge Computing Environment.” was published.
+As the first part of this introduction, I focus only on the motivation and a summary of our work.
+More details on the design and implementation can be found in later posts.</p>
+
+<p><img src="/static/2021-10/edgelearn-1.png" height="250" /></p>
+
+<h2 id="why-do-we-need-training-on-edge">Why do we need training on edge?</h2>
+
+<p>The cloud is not trustworthy anymore. More and more evidence shows that cloud breaches happen more frequently than before.
+Nowadays, as more and more sensitive personal data is uploaded to cloud data centers, tech companies know users better than the users know themselves.</p>
+
+<p>Researchers in both industry and academia are looking for ways to keep learning from users’ data while leaving the raw sensitive data under users’ control.
+Many publications have already shown the feasibility of sharing only the trained model instead of the raw data.
+One popular recent study on this is Google’s <a href="https://ai.googleblog.com/2017/04/federated-learning-collaborative.html">federated learning</a>.</p>
+
+<p>While investigating this problem, we found that letting end users train on their own data is safe but sacrifices efficiency.
+Since a single end device has limited resources, training time and power consumption can be disappointing.
+We believe there must be a balance between privacy and efficiency in some target scenarios.</p>
+
+<p>Fortunately, we observed that users who belong to the same campus, plant, firm, or community tend to share similar interests.
+Therefore, these co-located users have similar demands for AI-involved routines.
+Co-located users are also easily targeted by the same types of threats, such as ransomware aimed at financial practitioners.</p>
+
+<p>Consider this scenario: features of a new malware app are sent to cloud services to train a neural network used by an antivirus program.
+This process may take a long time, and a small number of samples may not be recognized by the global neural network model.
+A customized local model trained and deployed on the edge can counter this problem.
+Using edge training as a supplement to cloud training can achieve better response times and make the whole system more flexible.</p>
+
+<h2 id="why-training-on-edge-is-hard">Why training on edge is hard?</h2>
+
+<p>Since all co-located users’ devices can be used for edge training, several issues and challenges arise when deploying such a distributed system.</p>
+
+<p>The first challenge is <strong>struggling workers</strong>.
+Training devices are heterogeneous, ranging from resource-limited IoT cameras to high-end media centers with powerful GPUs.
+They are not designed for machine learning.
+So, a good edge-based distributed learning framework must be able to handle a variety of training speeds.</p>
+
+<p>The second challenge is how to <strong>scale up</strong> clusters.
+On a campus, thousands or more devices may contribute computing resources to the same training task.
+However, these devices may be located far apart, either physically or in the network topology.
+How to use them well without struggling with endless transmission time remains a challenge.</p>
+
+<p>The third issue is the frequent <strong>joining and exiting</strong> of devices.
+We can’t rely on each device to faithfully work on training tasks rather than its original workload.
+Smartly balancing the workload and handling join/exit events also need to be taken into consideration.</p>
+
+<h2 id="our-proposal">Our proposal</h2>
+
+<ul>
+  <li>
+    <p>Dynamic training data distribution and runtime profiler</p>
+
+    <p>We design a dynamic training data distribution mechanism that helps with both the first and the third challenges.
+  Preprocessed data can be transmitted without leaking raw sensitive information.
+  This helps struggling workers, which can train on smaller batches so that they upload parameters after a similar training time.
+  Also, for extremely slow devices and for devices that join or exit, dynamic data distribution and the profiler help keep the global training parameters from being polluted or becoming stale.</p>
+
+    <p>To counter heterogeneity, more approaches were applied in our later research.
+  More details were added to the runtime profiler in later work.</p>
+  </li>
+  <li>
+    <p>Asynchronous and synchronous aggregation enabled</p>
+
+    <p>In our findings, asynchronous and synchronous parameter updates each have their pros and cons.
+  Staying synchronous all the time leaves the struggling worker issue unsolvable.
+  However, the harm that asynchronous updates do to accuracy and convergence time also needs attention.
+  We propose carefully choosing between these two update policies at runtime to make use of the advantages of each.</p>
+  </li>
+  <li>
+    <p>Leader role splitting</p>
+
+    <p>The idea is to let worker devices with higher bandwidth take the leader role during training.
+  Parameter updating does not require much computation; it mainly needs bandwidth.
+  Devices with sufficient bandwidth can also work as virtual leader devices.
+  This approach helps minimize the number of physical devices we use, and more leaders can further raise the limit on the number of workers.</p>
+  </li>
+</ul>
+
+  </article>
+  
+  
+
+<div class="post-comments">
+  <div id="disqus_thread"></div>
+  <script type="text/javascript">
+      var disqus_shortname = 'codersherlockblog'; // required: replace example with your forum shortname
+      (function() {
+          var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
+          dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
+          (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
+      })();
+  </script>
+</div>
+
+
+
+
+</div>
+
+        </div>
+        <div class="col-second">
+          <div class="col-box col-box-author">
+  <img class="avatar" src="/static/avatar.jpg" alt="Pengzhan Hao">
+  <div class="col-box-title name">Pengzhan Hao</div>
+  <p></p>
+  <p class="contact">
+    
+    <a href="https://github.com/codersherlock">GitHub</a>
+    
+    
+    
+    <a href="mailto:haopengzhan@gmail.com">Email</a>
+    
+  </p>
+</div>
+
+<div class="col-box">
+  <div class="col-box-title">Newest Posts</div>
+  <ul class="post-list">
+    
+      <li><a class="post-link" href="/archivers/eddl-how-do-we-train-on-limited-edge-devices">EDDL: How do we train neural networks on limited edge devices - PART 1</a></li>
+    
+      <li><a class="post-link" href="/archivers/generate-word-cloud-with-chinese-fenci">Generate Word Cloud Figures with Chinese-Tokenization and WordCloud python libraries</a></li>
+    
+      <li><a class="post-link" href="/archivers/intro-xv6">Xv6 introduction</a></li>
+    
+      <li><a class="post-link" href="/archivers/some-of-my-previews-exper-work">Some of my previews experiment works: 2016</a></li>
+    
+      <li><a class="post-link" href="/archivers/charles-is-not-a-good-tool">Using charles proxy to monitor mobile SSL traffics</a></li>
+    
+  </ul>
+</div>
+
+<div class="col-box post-toc hide">
+  <div class="col-box-title">Indexes</div>
+</div>
+        </div>
+      </div>
+    </div>
+
+    <footer class="footer">
+<div class="wrapper">
+&copy; 2016 Pengzhan Hao
+</div>
+</footer>
+
+<script src="/js/easybook.js"></script>
+
+  </body>
+
+</html>
@@ -59,6 +59,8 @@
      
        
      
+        
+      
    </nav>
  </div>
 </header>
@@ -277,6 +279,8 @@ If your written language is based on latin alphabet(or other language has space
  <div class="col-box-title">Newest Posts</div>
  <ul class="post-list">
    
+      <li><a class="post-link" href="/archivers/eddl-how-do-we-train-on-limited-edge-devices">EDDL: How do we train neural networks on limited edge devices - PART 1</a></li>
+    
      <li><a class="post-link" href="/archivers/generate-word-cloud-with-chinese-fenci">Generate Word Cloud Figures with Chinese-Tokenization and WordCloud python libraries</a></li>
    
      <li><a class="post-link" href="/archivers/intro-xv6">Xv6 introduction</a></li>
@@ -285,8 +289,6 @@ If your written language is based on latin alphabet(or other language has space
    
      <li><a class="post-link" href="/archivers/charles-is-not-a-good-tool">Using charles proxy to monitor mobile SSL traffics</a></li>
    
-      <li><a class="post-link" href="/archivers/hello">Stop Talking is the worst title of one blog</a></li>
-    
  </ul>
 </div>

@@ -59,6 +59,8 @@
      
        
      
+        
+      
    </nav>
  </div>
 </header>
@@ -118,6 +120,8 @@
  <div class="col-box-title">Newest Posts</div>
  <ul class="post-list">
    
+      <li><a class="post-link" href="/archivers/eddl-how-do-we-train-on-limited-edge-devices">EDDL: How do we train neural networks on limited edge devices - PART 1</a></li>
+    
      <li><a class="post-link" href="/archivers/generate-word-cloud-with-chinese-fenci">Generate Word Cloud Figures with Chinese-Tokenization and WordCloud python libraries</a></li>
    
      <li><a class="post-link" href="/archivers/intro-xv6">Xv6 introduction</a></li>
@@ -126,8 +130,6 @@
    
      <li><a class="post-link" href="/archivers/charles-is-not-a-good-tool">Using charles proxy to monitor mobile SSL traffics</a></li>
    
-      <li><a class="post-link" href="/archivers/hello">Stop Talking is the worst title of one blog</a></li>
-    
  </ul>
 </div>

@@ -59,6 +59,8 @@
      
        
      
+        
+      
    </nav>
  </div>
 </header>
@@ -192,6 +194,8 @@ Using ssh may connect to different physical devices under same domain name, this
  <div class="col-box-title">Newest Posts</div>
  <ul class="post-list">
    
+      <li><a class="post-link" href="/archivers/eddl-how-do-we-train-on-limited-edge-devices">EDDL: How do we train neural networks on limited edge devices - PART 1</a></li>
+    
      <li><a class="post-link" href="/archivers/generate-word-cloud-with-chinese-fenci">Generate Word Cloud Figures with Chinese-Tokenization and WordCloud python libraries</a></li>
    
      <li><a class="post-link" href="/archivers/intro-xv6">Xv6 introduction</a></li>
@@ -200,8 +204,6 @@ Using ssh may connect to different physical devices under same domain name, this
    
      <li><a class="post-link" href="/archivers/charles-is-not-a-good-tool">Using charles proxy to monitor mobile SSL traffics</a></li>
    
-      <li><a class="post-link" href="/archivers/hello">Stop Talking is the worst title of one blog</a></li>
-    
  </ul>
 </div>

@@ -59,6 +59,8 @@
      
        
      
+        
+      
    </nav>
  </div>
 </header>
@@ -227,6 +229,8 @@ su
  <div class="col-box-title">Newest Posts</div>
  <ul class="post-list">
    
+      <li><a class="post-link" href="/archivers/eddl-how-do-we-train-on-limited-edge-devices">EDDL: How do we train neural networks on limited edge devices - PART 1</a></li>
+    
      <li><a class="post-link" href="/archivers/generate-word-cloud-with-chinese-fenci">Generate Word Cloud Figures with Chinese-Tokenization and WordCloud python libraries</a></li>
    
      <li><a class="post-link" href="/archivers/intro-xv6">Xv6 introduction</a></li>
@@ -235,8 +239,6 @@ su
    
      <li><a class="post-link" href="/archivers/charles-is-not-a-good-tool">Using charles proxy to monitor mobile SSL traffics</a></li>
    
-      <li><a class="post-link" href="/archivers/hello">Stop Talking is the worst title of one blog</a></li>
-    
  </ul>
 </div>

@@ -60,6 +60,8 @@
      
        
      
+        
+      
    </nav>
  </div>
 </header>
@@ -95,6 +97,8 @@
 <h2 class="category" id="Research">RESEARCH</h2>
 <ul>

+<li><span>Oct 13</span> » <a href="/archivers/eddl-how-do-we-train-on-limited-edge-devices">EDDL: How do we train neural networks on limited edge devices - PART 1</a></li>
+
 <li><span>Oct 28</span> » <a href="/archivers/some-of-my-previews-exper-work">Some of my previews experiment works: 2016</a></li>

 </ul>
@@ -141,6 +145,8 @@
  <div class="col-box-title">Newest Posts</div>
  <ul class="post-list">
    
+      <li><a class="post-link" href="/archivers/eddl-how-do-we-train-on-limited-edge-devices">EDDL: How do we train neural networks on limited edge devices - PART 1</a></li>
+    
      <li><a class="post-link" href="/archivers/generate-word-cloud-with-chinese-fenci">Generate Word Cloud Figures with Chinese-Tokenization and WordCloud python libraries</a></li>
    
      <li><a class="post-link" href="/archivers/intro-xv6">Xv6 introduction</a></li>
@@ -149,8 +155,6 @@
    
      <li><a class="post-link" href="/archivers/charles-is-not-a-good-tool">Using charles proxy to monitor mobile SSL traffics</a></li>
    
-      <li><a class="post-link" href="/archivers/hello">Stop Talking is the worst title of one blog</a></li>
-    
  </ul>
 </div>

@@ -6,10 +6,99 @@
 </description>
    <link>https://codersherlock.github.com//</link>
    <atom:link href="https://codersherlock.github.com//feed.xml" rel="self" type="application/rss+xml"/>
-    <pubDate>Tue, 12 Oct 2021 18:31:37 -0400</pubDate>
-    <lastBuildDate>Tue, 12 Oct 2021 18:31:37 -0400</lastBuildDate>
+    <pubDate>Wed, 13 Oct 2021 18:33:50 -0400</pubDate>
+    <lastBuildDate>Wed, 13 Oct 2021 18:33:50 -0400</lastBuildDate>
    <generator>Jekyll v4.1.1</generator>
    
+      <item>
+        <title>EDDL: How do we train neural networks on limited edge devices - PART 1</title>
+        <description>&lt;p&gt;This post introduces our previous milestone in project “Edge trainer”, as the paper “EDDL: A Distributed Deep Learning System for Resource-limited Edge Computing Environment.” was published.
+As the first part of this introduction, I focus only on the motivation and a summary of our work.
+More details on the design and implementation can be found in later posts.&lt;/p&gt;
+
+&lt;p&gt;&lt;img src=&quot;/static/2021-10/edgelearn-1.png&quot; height=&quot;250&quot; /&gt;&lt;/p&gt;
+
+&lt;h2 id=&quot;why-do-we-need-training-on-edge&quot;&gt;Why do we need training on edge?&lt;/h2&gt;
+
+&lt;p&gt;The cloud is not trustworthy anymore. More and more evidence shows that cloud breaches happen more frequently than before.
+Nowadays, as more and more sensitive personal data is uploaded to cloud data centers, tech companies know users better than the users know themselves.&lt;/p&gt;
+
+&lt;p&gt;Researchers in both industry and academia are looking for ways to keep learning from users’ data while leaving the raw sensitive data under users’ control.
+Many publications have already shown the feasibility of sharing only the trained model instead of the raw data.
+One popular recent study on this is Google’s &lt;a href=&quot;https://ai.googleblog.com/2017/04/federated-learning-collaborative.html&quot;&gt;federated learning&lt;/a&gt;.&lt;/p&gt;
+
+&lt;p&gt;While investigating this problem, we found that letting end users train on their own data is safe but sacrifices efficiency.
+Since a single end device has limited resources, training time and power consumption can be disappointing.
+We believe there must be a balance between privacy and efficiency in some target scenarios.&lt;/p&gt;
+
+&lt;p&gt;Fortunately, we observed that users who belong to the same campus, plant, firm, or community tend to share similar interests.
+Therefore, these co-located users have similar demands for AI-involved routines.
+Co-located users are also easily targeted by the same types of threats, such as ransomware aimed at financial practitioners.&lt;/p&gt;
+
+&lt;p&gt;Consider this scenario: features of a new malware app are sent to cloud services to train a neural network used by an antivirus program.
+This process may take a long time, and a small number of samples may not be recognized by the global neural network model.
+A customized local model trained and deployed on the edge can counter this problem.
+Using edge training as a supplement to cloud training can achieve better response times and make the whole system more flexible.&lt;/p&gt;
+
+&lt;h2 id=&quot;why-training-on-edge-is-hard&quot;&gt;Why training on edge is hard?&lt;/h2&gt;
+
+&lt;p&gt;Since all co-located users’ devices can be used for edge training, several issues and challenges arise when deploying such a distributed system.&lt;/p&gt;
+
+&lt;p&gt;The first challenge is &lt;strong&gt;struggling workers&lt;/strong&gt;.
+Training devices are heterogeneous, ranging from resource-limited IoT cameras to high-end media centers with powerful GPUs.
+They are not designed for machine learning.
+So, a good edge-based distributed learning framework must be able to handle a variety of training speeds.&lt;/p&gt;
+
+&lt;p&gt;The second challenge is how to &lt;strong&gt;scale up&lt;/strong&gt; clusters.
+On a campus, thousands or more devices may contribute computing resources to the same training task.
+However, these devices may be located far apart, either physically or in the network topology.
+How to use them well without struggling with endless transmission time remains a challenge.&lt;/p&gt;
+
+&lt;p&gt;The third issue is the frequent &lt;strong&gt;joining and exiting&lt;/strong&gt; of devices.
+We can’t rely on each device to faithfully work on training tasks rather than its original workload.
+Smartly balancing the workload and handling join/exit events also need to be taken into consideration.&lt;/p&gt;
+
+&lt;h2 id=&quot;our-proposal&quot;&gt;Our proposal&lt;/h2&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;
+    &lt;p&gt;Dynamic training data distribution and runtime profiler&lt;/p&gt;
+
+    &lt;p&gt;We design a dynamic training data distribution mechanism that helps with both the first and the third challenges.
+  Preprocessed data can be transmitted without leaking raw sensitive information.
+  This helps struggling workers, which can train on smaller batches so that they upload parameters after a similar training time.
+  Also, for extremely slow devices and for devices that join or exit, dynamic data distribution and the profiler help keep the global training parameters from being polluted or becoming stale.&lt;/p&gt;
+
+    &lt;p&gt;To counter heterogeneity, more approaches were applied in our later research.
+  More details were added to the runtime profiler in later work.&lt;/p&gt;
+  &lt;/li&gt;
+  &lt;li&gt;
+    &lt;p&gt;Asynchronous and synchronous aggregation enabled&lt;/p&gt;
+
+    &lt;p&gt;In our findings, asynchronous and synchronous parameter updates each have their pros and cons.
+  Staying synchronous all the time leaves the struggling worker issue unsolvable.
+  However, the harm that asynchronous updates do to accuracy and convergence time also needs attention.
+  We propose carefully choosing between these two update policies at runtime to make use of the advantages of each.&lt;/p&gt;
+  &lt;/li&gt;
+  &lt;li&gt;
+    &lt;p&gt;Leader role splitting&lt;/p&gt;
+
+    &lt;p&gt;The idea is to let worker devices with higher bandwidth take the leader role during training.
+  Parameter updating does not require much computation; it mainly needs bandwidth.
+  Devices with sufficient bandwidth can also work as virtual leader devices.
+  This approach helps minimize the number of physical devices we use, and more leaders can further raise the limit on the number of workers.&lt;/p&gt;
+  &lt;/li&gt;
+&lt;/ul&gt;
+</description>
+        <pubDate>Wed, 13 Oct 2021 16:53:20 -0400</pubDate>
+        <link>https://codersherlock.github.com//archivers/eddl-how-do-we-train-on-limited-edge-devices</link>
+        <guid isPermaLink="true">https://codersherlock.github.com//archivers/eddl-how-do-we-train-on-limited-edge-devices</guid>
+        
+        
+        <category>Research</category>
+        
+      </item>
+    
      <item>
        <title>Generate Word Cloud Figures with Chinese-Tokenization and WordCloud python libraries</title>
        <description>&lt;p&gt;Let’s generate a word cloud like this. 
@@ -60,6 +60,8 @@
      
        
      
+        
+      
    </nav>
  </div>
 </header>
@@ -73,6 +75,24 @@

  <ul class="post-list">
    
+      <li>
+        <h2>
+          <a class="post-link" href="/archivers/eddl-how-do-we-train-on-limited-edge-devices">EDDL: How do we train neural networks on limited edge devices - PART 1</a>
+        </h2>
+        
+        <div class="post-meta">Oct 13, 2021</div>
+
+        <div class="post-excerpt">
+          <p>This post introduces our previous milestone in project “Edge trainer”, as the paper “EDDL: A Distributed Deep Learning System for Resource-limited Edge Computing Environment.” was published.
+As the first part of this introduction, I focus only on the motivation and a summary of our work.
+More details on the design and implementation can be found in later posts.</p>
+
+          <p>
+            <a class="post-link" href="/archivers/eddl-how-do-we-train-on-limited-edge-devices">Read More &raquo;</a>
+          </p>
+        </div>
+      </li>
+    
      <li>
        <h2>
          <a class="post-link" href="/archivers/generate-word-cloud-with-chinese-fenci">Generate Word Cloud Figures with Chinese-Tokenization and WordCloud python libraries</a>
@@ -144,22 +164,6 @@ My current solution is using AP to forward all SSL traffic to a proxy, <a href="
        </div>
      </li>
    
-      <li>
-        <h2>
-          <a class="post-link" href="/archivers/hello">Stop Talking is the worst title of one blog</a>
-        </h2>
-        
-        <div class="post-meta">Oct 26, 2016</div>
-
-        <div class="post-excerpt">
-          
-
-          <p>
-            <a class="post-link" href="/archivers/hello">Read More &raquo;</a>
-          </p>
-        </div>
-      </li>
-    
  </ul>
  
  <!-- Pagination links -->
@@ -167,9 +171,9 @@ My current solution is using AP to forward all SSL traffic to a proxy, <a href="
  
    <span class="previous">PREV</span>
  
-  <span class="page_number ">1 of 1</span>
+  <span class="page_number ">1 of 2</span>
  
-    <span class="next ">NEXT</span>
+    <a href="/page2" class="next">NEXT</a>
  
 </div>

@@ -196,6 +200,8 @@ My current solution is using AP to forward all SSL traffic to a proxy, <a href="
  <div class="col-box-title">Newest Posts</div>
  <ul class="post-list">
    
+      <li><a class="post-link" href="/archivers/eddl-how-do-we-train-on-limited-edge-devices">EDDL: How do we train neural networks on limited edge devices - PART 1</a></li>
+    
      <li><a class="post-link" href="/archivers/generate-word-cloud-with-chinese-fenci">Generate Word Cloud Figures with Chinese-Tokenization and WordCloud python libraries</a></li>
    
      <li><a class="post-link" href="/archivers/intro-xv6">Xv6 introduction</a></li>
@@ -204,8 +210,6 @@ My current solution is using AP to forward all SSL traffic to a proxy, <a href="
    
      <li><a class="post-link" href="/archivers/charles-is-not-a-good-tool">Using charles proxy to monitor mobile SSL traffics</a></li>
    
-      <li><a class="post-link" href="/archivers/hello">Stop Talking is the worst title of one blog</a></li>
-    
  </ul>
 </div>

@@ -0,0 +1,162 @@
+<!DOCTYPE html>
+<html>
+
+  <head>
+  <meta charset="utf-8">
+  <meta http-equiv="X-UA-Compatible" content="IE=edge">
+  <meta name="viewport" content="width=device-width, initial-scale=1">
+
+  <title>Stop Talking, Start Doing</title>
+  <meta name="description" content="My personal blog, with some boring research stuff and some tricks I fancy. I'll try my best to make this blog fun and useful, not just a place where I complain about everything that happens in my lab.
+">
+
+  <link rel="stylesheet" href="/css/main.css">
+  <link rel="stylesheet" href="/css/timeline.css">
+  <link rel="canonical" href="https://codersherlock.github.com//page2/">
+  <link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Tangerine">
+  <link rel="alternate" type="application/rss+xml" title="Stop Talking, Start Doing" href="https://codersherlock.github.com//feed.xml" />
+  <script>
+  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+        (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
+    m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+          })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
+
+  ga('create', 'UA-82637164-1', 'auto');
+    ga('send', 'pageview');
+
+  </script>
+  <script async src="//pagead2.googlesyndication.com/pagead/js/adsbygoogle.js"></script>
+  <script>
+    (adsbygoogle = window.adsbygoogle || []).push({
+      google_ad_client: "ca-pub-6651321038908478",
+      enable_page_level_ads: true
+    });
+  </script>
+</head>
+
+
+  <body>
+
+    <header class="header">
+  <div class="wrapper">
+    <a class="site-title" href="/">Stop Talking, Start Doing</a>
+    <nav class="site-nav">
+      
+        
+      
+        
+        <a class="page-link" href="/about/">About</a>
+        
+      
+        
+        <a class="page-link" href="/category/">Category</a>
+        
+      
+        
+      
+        
+      
+        
+      
+        
+      
+        
+      
+        
+      
+    </nav>
+  </div>
+</header>
+
+    <div class="page-content">
+      <div class="wrapper">
+        <div class="col-main">
+          <div class="home">
+  <a class="rss-link" href="/feed.xml">RSS Feed</a>
+  <h1 class="page-heading">Articles</h1>
+
+  <ul class="post-list">
+    
+      <li>
+        <h2>
+          <a class="post-link" href="/archivers/hello">Stop Talking is the worst title of one blog</a>
+        </h2>
+        
+        <div class="post-meta">Oct 26, 2016</div>
+
+        <div class="post-excerpt">
+          
+
+          <p>
+            <a class="post-link" href="/archivers/hello">Read More &raquo;</a>
+          </p>
+        </div>
+      </li>
+    
+  </ul>
+  
+  <!-- Pagination links -->
+<div class="pagination">
+  
+    <a href="/" class="previous">PREV</a>
+  
+  <span class="page_number ">2 of 2</span>
+  
+    <span class="next ">NEXT</span>
+  
+</div>
+
+</div>
+
+        </div>
+        <div class="col-second">
+          <div class="col-box col-box-author">
+  <img class="avatar" src="/static/avatar.jpg" alt="Pengzhan Hao">
+  <div class="col-box-title name">Pengzhan Hao</div>
+  <p></p>
+  <p class="contact">
+    
+    <a href="https://github.com/codersherlock">GitHub</a>
+    
+    
+    
+    <a href="mailto:haopengzhan@gmail.com">Email</a>
+    
+  </p>
+</div>
+
+<div class="col-box">
+  <div class="col-box-title">Newest Posts</div>
+  <ul class="post-list">
+    
+      <li><a class="post-link" href="/archivers/eddl-how-do-we-train-on-limited-edge-devices">EDDL: How do we train neural networks on limited edge devices - PART 1</a></li>
+    
+      <li><a class="post-link" href="/archivers/generate-word-cloud-with-chinese-fenci">Generate Word Cloud Figures with Chinese-Tokenization and WordCloud python libraries</a></li>
+    
+      <li><a class="post-link" href="/archivers/intro-xv6">Xv6 introduction</a></li>
+    
+      <li><a class="post-link" href="/archivers/some-of-my-previews-exper-work">Some of my previews experiment works: 2016</a></li>
+    
+      <li><a class="post-link" href="/archivers/charles-is-not-a-good-tool">Using charles proxy to monitor mobile SSL traffics</a></li>
+    
+  </ul>
+</div>
+
+<div class="col-box post-toc hide">
+  <div class="col-box-title">Indexes</div>
+</div>
+        </div>
+      </div>
+    </div>
+
+    <footer class="footer">
+<div class="wrapper">
+&copy; 2016 Pengzhan Hao
+</div>
+</footer>
+
+<script src="/js/easybook.js"></script>
+
+  </body>
+
+</html>
@@ -62,6 +62,8 @@
      
        
      
+        
+      
    </nav>
  </div>
 </header>
@@ -208,6 +210,8 @@
  <div class="col-box-title">Newest Posts</div>
  <ul class="post-list">
    
+      <li><a class="post-link" href="/archivers/eddl-how-do-we-train-on-limited-edge-devices">EDDL: How do we train neural networks on limited edge devices - PART 1</a></li>
+    
      <li><a class="post-link" href="/archivers/generate-word-cloud-with-chinese-fenci">Generate Word Cloud Figures with Chinese-Tokenization and WordCloud python libraries</a></li>
    
      <li><a class="post-link" href="/archivers/intro-xv6">Xv6 introduction</a></li>
@@ -216,8 +220,6 @@
    
      <li><a class="post-link" href="/archivers/charles-is-not-a-good-tool">Using charles proxy to monitor mobile SSL traffics</a></li>
    
-      <li><a class="post-link" href="/archivers/hello">Stop Talking is the worst title of one blog</a></li>
-    
  </ul>
 </div>