metron-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (METRON-389) Create Java API to Read Profile Data During Model Scoring
Date Mon, 12 Sep 2016 13:26:20 GMT

    [ https://issues.apache.org/jira/browse/METRON-389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15484114#comment-15484114 ]

ASF GitHub Bot commented on METRON-389:
---------------------------------------

Github user nickwallen commented on a diff in the pull request:

    https://github.com/apache/incubator-metron/pull/236#discussion_r78371167
  
    --- Diff: metron-analytics/metron-profiler/src/main/java/org/apache/metron/profiler/hbase/SaltyRowKeyBuilder.java
---
    @@ -0,0 +1,211 @@
    +/*
    + *
    + *  Licensed to the Apache Software Foundation (ASF) under one
    + *  or more contributor license agreements.  See the NOTICE file
    + *  distributed with this work for additional information
    + *  regarding copyright ownership.  The ASF licenses this file
    + *  to you under the Apache License, Version 2.0 (the
    + *  "License"); you may not use this file except in compliance
    + *  with the License.  You may obtain a copy of the License at
    + *
    + *      http://www.apache.org/licenses/LICENSE-2.0
    + *
    + *  Unless required by applicable law or agreed to in writing, software
    + *  distributed under the License is distributed on an "AS IS" BASIS,
    + *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + *  See the License for the specific language governing permissions and
    + *  limitations under the License.
    + *
    + */
    +
    +package org.apache.metron.profiler.hbase;
    +
    +import org.apache.hadoop.hbase.util.Bytes;
    +import org.apache.metron.profiler.ProfileMeasurement;
    +import org.apache.metron.profiler.ProfilePeriod;
    +
    +import java.nio.ByteBuffer;
    +import java.security.MessageDigest;
    +import java.security.NoSuchAlgorithmException;
    +import java.util.ArrayList;
    +import java.util.List;
    +import java.util.concurrent.TimeUnit;
    +
    +/**
    + * A RowKeyBuilder that uses a salt to prevent hot-spotting.
    + *
    + * Responsible for building the row keys used to store profile data in HBase.  The row key is composed of the following
    + * fields in the given order.
    + * <ul>
    + * <li>salt - A salt that helps prevent hot-spotting.
    + * <li>profile - The name of the profile.
    + * <li>entity - The name of the entity being profiled.
    + * <li>group(s) - The group(s) used to sort the data in HBase. For example, a group may distinguish between weekends and weekdays.
    + * <li>year - The year based on UTC.
    + * <li>day of year - The current day within the year based on UTC; [1, 366]
    + * <li>hour - The hour within the day based on UTC; [0, 23]
    + * <li>period - The period within the hour.  The number of periods per hour can be defined by the user; defaults to 4.
    + * </ul>
    + */
    +public class SaltyRowKeyBuilder implements RowKeyBuilder {
    +
    +  /**
    +   * A salt can be prepended to the row key to help prevent hot-spotting.  The salt
    +   * divisor is used to generate the salt.  The salt divisor should be roughly equal
    +   * to the number of nodes in the HBase cluster.
    +   */
    +  private int saltDivisor;
    +
    +  /**
    +   * An hour is divided into multiple periods.  This defines how many periods
    +   * will exist within each given hour.
    +   */
    +  private int periodsPerHour;
    +
    +  public SaltyRowKeyBuilder() {
    +    this.saltDivisor = 1000;
    +    this.periodsPerHour = 4;
    +  }
    +
    +  public SaltyRowKeyBuilder(int saltDivisor, int periodsPerHour) {
    +    this.saltDivisor = saltDivisor;
    +    this.periodsPerHour = periodsPerHour;
    +  }
    +
    +  /**
    +   * Builds a list of row keys necessary to retrieve profile measurements over
    +   * a time horizon.
    +   *
    +   * @param profile The name of the profile.
    +   * @param entity The name of the entity.
    +   * @param groups The group(s) used to sort the profile data.
    +   * @param durationAgo How long ago?
    +   * @param unit The time units of how long ago.
    +   * @return All of the row keys necessary to retrieve the profile measurements.
    +   */
    +  @Override
    +  public List<byte[]> rowKeys(String profile, String entity, List<Object> groups, long durationAgo, TimeUnit unit) {
    --- End diff --
    
    Just because it matches the `ProfilerClient` interface.  It could easily be changed to start/end.  What advantages do you see in doing so?  Easier to test?
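
To make the start/end option concrete, here is a minimal sketch; it is not code from the pull request. `TimeRangeSketch`, `periodStarts`, and the explicit `now` argument are hypothetical names chosen for illustration, and the sketch only enumerates period start times (epoch milliseconds) rather than building real row keys. It assumes periods are aligned to the hour, as the Javadoc above describes. The `durationAgo`/`unit` form becomes a thin wrapper over the start/end form.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration only; not part of the pull request. */
public class TimeRangeSketch {

  /** Width of one period in milliseconds, e.g. 4 periods per hour gives 15 minutes. */
  static long periodMillis(int periodsPerHour) {
    return TimeUnit.HOURS.toMillis(1) / periodsPerHour;
  }

  /** Start/end form: the start time of every period that overlaps [start, end]. */
  static List<Long> periodStarts(long start, long end, int periodsPerHour) {
    long width = periodMillis(periodsPerHour);
    List<Long> starts = new ArrayList<>();
    // Begin at the period boundary at or before 'start', then step one period at a time.
    for (long t = start - Math.floorMod(start, width); t <= end; t += width) {
      starts.add(t);
    }
    return starts;
  }

  /** durationAgo/unit form: delegates to the start/end form, with 'now' passed in. */
  static List<Long> periodStarts(long now, long durationAgo, TimeUnit unit, int periodsPerHour) {
    return periodStarts(now - unit.toMillis(durationAgo), now, periodsPerHour);
  }
}
```

With fixed `start` and `end` values (or an injected `now`), the result does not depend on the wall clock, which is one way the start/end form could be easier to test.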


> Create Java API to Read Profile Data During Model Scoring
> ---------------------------------------------------------
>
>                 Key: METRON-389
>                 URL: https://issues.apache.org/jira/browse/METRON-389
>             Project: Metron
>          Issue Type: Improvement
>            Reporter: Nick Allen
>            Assignee: Nick Allen
>              Labels: profiler
>
> Functionality already exists to create Profiles in HBase.  A Java API is needed to read existing Profile data from HBase.  The API should be optimized for use when scoring models.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
