whimsical-commits mailing list archives

From s...@apache.org
Subject [whimsy] branch master updated: Scan for trademarks and copyright mentions
Date Thu, 27 Apr 2017 12:23:01 GMT
This is an automated email from the ASF dual-hosted git repository.

sebb pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/whimsy.git

The following commit(s) were added to refs/heads/master by this push:
       new  ee1ee7f   Scan for trademarks and copyright mentions
ee1ee7f is described below

commit ee1ee7f653588b18647a223181e94e10e877c4a3
Author: Sebb <sebb@apache.org>
AuthorDate: Thu Apr 27 13:22:59 2017 +0100

    Scan for trademarks and copyright mentions
---
 tools/site-scan.rb | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/tools/site-scan.rb b/tools/site-scan.rb
index beb2518..6345c7b 100755
--- a/tools/site-scan.rb
+++ b/tools/site-scan.rb
@@ -41,6 +41,8 @@ def parse(site, name)
     license: nil,
     sponsorship: nil,
     security: nil,
+    trademarks: nil,
+    copyright: nil,
   }
 
   # scan each link
@@ -82,6 +84,17 @@ def parse(site, name)
       data[:sponsorship] = uri + a['href'].strip
     end
   end
+  doc.traverse do |node|
+    next unless node.is_a?(Nokogiri::XML::Text)
+    # scrub is needed as some sites have invalid UTF-8 bytes
+    txt = node.text.scrub.gsub(/\s+/, ' ').strip
+    if txt =~ /trademarks of [Tt]he Apache Software Foundation/
+      data[:trademarks] = txt
+    end
+    if txt =~ /Copyright .+ [Tt]he Apache Software Foundation/
+      data[:copyright] = txt
+    end
+  end
   return data
 end
 
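
The text-node scan added above can be tried in isolation with the sketch below. It is an illustration under stated assumptions, not the code in tools/site-scan.rb: the helper name `scan_texts`, the two constant names, and the sample strings are hypothetical, and a plain array of strings stands in for the Nokogiri text nodes so that only the Ruby standard library is needed. The two regular expressions and the scrub/collapse/strip normalization are copied from the diff.

```ruby
# Hypothetical stand-alone sketch of the scan; the names below are not in the commit.
TRADEMARKS_RE = /trademarks of [Tt]he Apache Software Foundation/
COPYRIGHT_RE  = /Copyright .+ [Tt]he Apache Software Foundation/

# `texts` stands in for the text of each Nokogiri::XML::Text node.
def scan_texts(texts)
  data = { trademarks: nil, copyright: nil }
  texts.each do |raw|
    # scrub replaces invalid UTF-8 bytes so gsub does not raise;
    # whitespace runs (including newlines) are then collapsed to one space
    txt = raw.scrub.gsub(/\s+/, ' ').strip
    data[:trademarks] = txt if txt =~ TRADEMARKS_RE
    data[:copyright]  = txt if txt =~ COPYRIGHT_RE
  end
  data
end

# Made-up sample text, including one string with an invalid byte
sample = [
  "Copyright \u00A9 2017\n   The Apache Software Foundation",
  "Apache and the feather logo are\ttrademarks of the Apache Software Foundation.",
  "bad bytes: \xFF unrelated text",
]
result = scan_texts(sample)
puts result[:copyright]
puts result[:trademarks]
```

As in the diff, a later matching node overwrites an earlier one, so each key ends up holding the last matching text on the page.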

-- 
To stop receiving notification emails like this one, please contact
commits@whimsical.apache.org.
