January 11, 2019

Happy to help you with Big Data, Machine Learning, and Deep Learning techniques for your project.

Hello everybody,

How many of you leverage the data of your website?

Just looking at the data your customers generate on your website, if properly analyzed, can provide extremely useful insights into what they like or don't like about your web service.

AI is a hyped term, but its potential is real. Don't think of AI only as rocket science applied to robotics or domotics (home automation).

Techniques such as advanced data analytics, Machine Learning and the new frontier of Deep Learning are crucial to help your business make data-informed decisions and improve customer engagement (e.g. through recommendation systems) and customer experience throughout the journey on your site.

If you would like to discuss anything here, just leave a comment below.

Thanks :).

  1. 1

    I could use some help. We need pattern recognition. https://www.tensorflow.org

    1. 1

      What is the problem you want to solve? What data do you have? What business do you operate in? Please be more specific, either here or, if you prefer, I can send you my email.

  2. 1

    If anybody needs support for Big Data, Machine Learning and Deep Learning, they can just contact me for a personalized experience, and I'll be happy to help for FREE at the beginning.

  3. 1

    Sorry to divert the topic.

    If anyone wants to hire a technical AI/ML advisor for mentoring or consulting, from people who have implemented large-scale ML systems, please visit my website: www.maester.ai

    Good luck!

  4. 1

    What is the most sought-after language you would advise someone who wants to go into machine learning to learn?

    1. 1

      As I said before, start with Python.

      It's easy to use and extremely flexible, there is a package to do almost everything in Python, and, last but not least, it has a large community of users.

  5. 1

    What is your preferred technology stack?

    1. 1

      Definitely Python, as it is so flexible, easy to use, and has a great community.

      In addition to that, scikit-learn for ML, and PyTorch (plus, to a lesser extent, TensorFlow) for DL.

      1. 1

        Thanks for your response. What are your thoughts on tensorflow.js? Have you had the chance to form an opinion?

        1. 1

          TensorFlow.js is a good tool, especially for deploying the weights of a trained model into production.

          1. 1

            Thank you for your feedback, I am looking to get into it as well.

            1. 1

              You're welcome, man ;)

  6. 1


    I would like to implement a tagging algorithm for https://careerin.tech. I have a list of companies with descriptions and a list of jobs, each with a job title and job description. I have a list of categories for jobs (e.g. programming language, framework, etc.).

    So far I don't have a training set, e.g. hand-tagged jobs or companies.

    I would like to implement a simple solution with good accuracy in JavaScript. Should I just use something like a Bayes filter? What's your suggestion?

    Thanks for your help.

    Best Philipp

    1. 2

      Ok, you'd need a classification algorithm.

      In order to do that, a training data set is needed. There might be similar labeled datasets freely available online.

      Once you get that, you need to train a specific algorithm to classify job descriptions into tags. There are different ones, but I would use Deep Learning for that.
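
      A minimal sketch of the classification step described above, assuming a small hand-tagged training set is already available. It uses a multinomial naive Bayes classifier (the "bayes filter" idea from the question) rather than deep learning, simply because it fits in a few lines of plain Python. The example texts, the tag names, and the `NaiveBayesTagger` class are all hypothetical, and the sketch predicts only a single tag per text.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


class NaiveBayesTagger:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # tag -> word -> count
        self.doc_counts = Counter()              # tag -> number of training texts
        self.vocabulary = set()

    def train(self, examples):
        """examples: iterable of (text, tag) pairs labeled by hand."""
        for text, tag in examples:
            tokens = tokenize(text)
            self.word_counts[tag].update(tokens)
            self.doc_counts[tag] += 1
            self.vocabulary.update(tokens)

    def predict(self, text):
        """Return the tag with the highest log-posterior for the text."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_tag, best_score = None, -math.inf
        for tag, n_docs in self.doc_counts.items():
            total_words = sum(self.word_counts[tag].values())
            score = math.log(n_docs / total_docs)  # log prior
            for token in tokens:
                count = self.word_counts[tag][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_tag, best_score = tag, score
        return best_tag


# Hypothetical hand-tagged examples, invented for illustration only.
training_set = [
    ("We build web backends with Django and Flask in Python", "python"),
    ("Python developer for data pipelines and pandas scripts", "python"),
    ("Frontend engineer with React and Node in JavaScript", "javascript"),
    ("JavaScript developer building single page apps with Vue", "javascript"),
]

tagger = NaiveBayesTagger()
tagger.train(training_set)
predicted = tagger.predict("Looking for a developer who knows React and Vue")
print(predicted)
```

      With only four training texts this is a toy; the quality of the predictions depends almost entirely on how many labeled job descriptions are available.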

      1. 1

        Yes, I guess I need some data for training and should do the tagging by hand. I'm just a little bit lazy right now...

        I would like to keep it simple. I'm not familiar with deep learning (or ML in general). Is deep learning easy to implement?

        My first take on "classification" was keyword lookup in the description text. But if you have categories like "r" (the programming language) things start to fall apart.
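
        A rough sketch of the keyword lookup mentioned above, assuming the problem with "r" comes from matching substrings instead of whole tokens. The keyword list and the `find_tags` helper are hypothetical. Short names are matched case-sensitively and only when they stand alone, which reduces false hits but does not remove them (a standalone capital "R" can still mean something else).

```python
import re

# Hypothetical keyword list; the real categories would come from the job board.
KEYWORDS = ["Python", "JavaScript", "R", "C++"]


def build_pattern(keyword):
    """Match the keyword only when it is not glued to other word characters."""
    # Very short names such as "R" are matched case-sensitively,
    # longer ones case-insensitively.
    flags = 0 if len(keyword) <= 2 else re.IGNORECASE
    return re.compile(r"(?<![\w+#])" + re.escape(keyword) + r"(?![\w+#])", flags)


PATTERNS = {keyword: build_pattern(keyword) for keyword in KEYWORDS}


def find_tags(description):
    """Return the keywords that occur as standalone tokens in the text."""
    return [kw for kw, pattern in PATTERNS.items() if pattern.search(description)]


tags = find_tags("Senior engineer for our research team: R, python and some C++.")
print(tags)
```

        Here the lowercase "r" inside words such as "research" or "our" is ignored, while the standalone "R" is picked up.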

        I'm interested in an algorithm that detects street addresses in text too. I wrote a robust solution for addresses in my town based on regexes. But the solution only works for my town because I know all the zip codes and street names in the town.
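
        A minimal sketch of that kind of town-specific regex, assuming a list of known street names and zip codes. This is not the original solution; the street names, zip codes, address layout, and the `extract_addresses` helper are made up for illustration.

```python
import re

# Hypothetical reference data for one town; a real list would be much longer.
KNOWN_STREETS = ["Hauptstraße", "Bahnhofstraße", "Am Markt"]
KNOWN_ZIP_CODES = ["12345", "12347"]

street_part = "|".join(re.escape(street) for street in KNOWN_STREETS)
zip_part = "|".join(re.escape(code) for code in KNOWN_ZIP_CODES)

# Assumed layout: <street> <house number>[,] <zip code> <city>
ADDRESS_PATTERN = re.compile(
    rf"(?P<street>{street_part})\s+(?P<number>\d+[a-zA-Z]?)\b"
    rf"[,\s]+(?P<zip>{zip_part})\s+(?P<city>[A-ZÄÖÜ][\w-]+)"
)


def extract_addresses(text):
    """Return every address in the text that uses a known street and zip code."""
    return [match.groupdict() for match in ADDRESS_PATTERN.finditer(text)]


addresses = extract_addresses("Visit us at Hauptstraße 1, 12345 City or call.")
print(addresses)
```

        The pattern only recognizes streets and zip codes on the lists, which is exactly why the approach does not carry over to other towns.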

        Thanks for your time!

        1. 1

          > Is deep learning easy to implement?

          It's not difficult, though not so easy either.

          > My first take on "classification" was keyword lookup in the description text. But if you have categories like "r" (the programming language) things start to fall apart.

          Deep learning is definitely the best option for text classification. How long are the job descriptions, though?

          > I'm interested in an algorithm that detects street addresses in text too.

          Do you mean extracting the address? If not, I am not sure what you mean here. Can you explain it in more detail, please?

          You're welcome, anyway :)!

          1. 1

            The average job description is 3149.284688995215 characters long (this includes some HTML markup, which can be removed).

            Problem description: Extract a street address, e.g. Hauptstraße 1 42 12345 City from text (text extracted from website).

            I think the address extraction problem for all street addresses in the world is quite complex...

            1. 1

              On average, how many words are in 3100 characters?

              Yes, extracting street addresses could be an issue. I think it can be solved by DL, but I have never actually thought about it.

              Try looking at Google, as they deal with addresses and may have a solution for you.

              Otherwise, I can try to get you somebody to do that manually, if interested. How many items do you have?

              1. 1

                On average there should be 373.488038277512 words in a description.

                I will classify the training set by hand at some point. thx.

                1. 1

                  Wow, the number of words is enough to get the job position tag.

                  If you don't have hundreds to thousands of job positions, it doesn't make sense to develop an algorithm, for sure.

                  Good luck ;).

      1. 1

        I prefer a solution where I know the algorithm and where I can change the algorithm to my liking.

    2. 1

      You can build your training set via MechanicalTurk or SurveyMonkey. It will cost you money but at least you won't be hand tagging all those yourself. However, I suggest that if it's a doable job, do it yourself because you'll learn how things work really quickly.

      I'm not familiar with Java libraries so I can't make a suggestion there :(

      1. 1

        I tried mturk for tagging user feedback for a consumer tech product (I don't want to mention my corporate affiliations here) and it was very unsuccessful (pre-determined labels like crashes, slowness, plugin issue, etc., with descriptions). I was hoping that running the same feedback by 3-5 turks would result in good plurality tags, but it didn't. Perhaps you've had better luck. My stupid regex parsing had a much better hit rate. That said, I'm not an mturk or ML specialist, just hacking with some vague knowledge from my college days.
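
        For reference, a small sketch of what combining 3-5 annotators' answers into one tag could look like. The `aggregate_label` function and the agreement threshold are assumptions for illustration, not the setup used in the experiment above.

```python
from collections import Counter


def aggregate_label(votes, min_share=0.5):
    """Return the most common label if enough annotators agree, else None."""
    if not votes:
        return None
    (label, count), *rest = Counter(votes).most_common()
    # A tie for first place means there is no clear winner.
    tied = bool(rest) and rest[0][1] == count
    if tied or count / len(votes) <= min_share:
        return None
    return label


clear = aggregate_label(["crashes", "crashes", "slowness"])
unclear = aggregate_label(["crashes", "slowness", "plugin issue"])
print(clear, unclear)
```

        Items that come back as `None` are the ones where the annotators disagreed, and they would need another round of tagging or a manual look.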

        1. 1

          I actually have a few questions about your experience using MTurk. It's for a project idea. Could you find my email in my profile and send me a message? I'd love to connect. Unless there's a DM feature on IH?

      2. 1

        Yes, I think it's doable but I'm too lazy right now :)

        I use JavaScript not Java. There are NLP toolkits for javascript.

  7. 1

    I absolutely agree. With the capabilities we have now to dig into the data, there is so much more insight that can be gained. I'm actually working on my next product, which will be doing exactly this.

    When you dig into the data and get into the math to provide forecasts and predictive analysis, it's amazing what's available. So often people are guessing that if they do x, then y will happen. There are no perfect crystal balls, but if you can at least have a decent idea of what will actually happen when you do x, why wouldn't you (assuming you don't make the wrong 'reading' :-))?

    @batlle500 I'd like to hear what you've been digging into and how you've been going about doing it.

    1. 1

      I am happy you share my view, as too many businesses (even big ones) did not understand the value of data until last year. Now, things are changing.

      @QuaffAPint, what do I do?

      Well, I am an economist with a PhD in agricultural statistics, and teach Big Data for Social Sciences at Bologna University (Italy). Working with data and statistical modelling is my bread and butter. I am also a FAO consultant.

      In addition to Stata, over the last 5 years I have been coding in Python, with a focus on applied Machine Learning and Deep Learning, ranging from NLP to recommendation systems to tabular data and images.

      It's a broad field and I try to work on as many different things as possible :). To be more precise, I basically build models (train ML and DL) to get insights about various phenomena.

      My personal mission is to spread the application of these advanced techniques beyond big tech companies.

      What about you?

      1. 1

        Wow, I would love to take your classes (I wanted to be in agriculture and combining that in with big data is fascinating) :-). I've been working with R and I'm amazed at some of the awesome packages it provides. I've been looking at coming up with ways to forecast website data in a general enough way to provide forecasts for the masses that still holds value.

        The big thing I want to do is more predictive: if they do x, then here's what it might look like; if instead they did y, here's what it would be. I'm still working on the best way to provide that so that the 'prediction' is of value. I have training data for each time period. I can then forecast out, but I'm thinking more about the best way to see what will happen if they increase x, as opposed to just forecasting with the current data.

        1. 1

          Cool stuff, though a bit too generic. Specifically, it is extremely difficult to get one solution to all problems. This is just a suggestion :).

          However, for your kind of problem (if I am not wrong), Bayesian statistics has long been used, though in games like chess or poker, deep learning is unbeatable.

          Lately, DeepMind, a Google lab, has developed a self-taught algorithm to play chess based on Deep Learning. Its performance is amazing.

  8. 1

    Do you really believe that Deep Learning is crucial to help a business make data-informed decisions?

    1. 1

      I'm going to add my two cents to this, coming from a real-world data science perspective.

      My short answer is no. Why? Two reasons come to mind. First, any decision made by a deep learning algorithm at the moment is pretty much a black box, meaning that you can give your clients/stakeholders an answer, but it'll be extremely hard to tell them "why" you came to that answer. Although this is the current state of deep learning models, it is quickly changing, and hopefully these black boxes will become grey boxes soon. Second, from what I've seen in the market, companies are using deep learning algorithms to approach REALLY SIMPLE problems (e.g., prediction of sales in different regions). A simple regression may be able to answer 95% of this question, and at the end of the day a business only needs to know whether to execute or not (agree to disagree), so we aren't shooting for a 99.9% accurate model.
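
      To make "a simple regression" concrete, here is a minimal sketch of an ordinary least-squares line fit in plain Python. The numbers are invented, and using advertising spend to predict sales in a region is only an assumed example of the kind of question meant above.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising spend and sales for one region.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [12.0, 14.0, 16.0, 18.0, 20.0]

slope, intercept = fit_line(ad_spend, sales)
forecast = slope * 6.0 + intercept  # predicted sales at a spend of 6
print(slope, intercept, forecast)
```

      Unlike a deep learning model, the fitted slope can be read directly as "how much sales change per unit of spend", which is the kind of "why" stakeholders tend to ask for.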

      Anyways, those are just my thoughts.

      1. 1

        Let me politely disagree with you.

        First, making data-informed decisions does not necessarily mean knowing what is going on inside the black box. It can simply mean predicting future sales, optimizing storage or knowing which segment of the market wants what. For all these decisions (and I can mention a lot more), knowing what is going to happen is way more important than knowing why it is going to happen.

        Second, linear regressions are only good at predicting linear relationships and rest on a lot of assumptions. Despite the wide use of regression, most real-world problems are not linear. That is why Deep Learning is far better.

    2. 1

      That's the point. Most people think of the new technologies/techniques as being suitable only for big companies. It reminds me of companies that were skeptical of going online years back, when the internet was just being introduced to the masses. And we all know how it ended.

      Turning back to your question, deep learning is the latest frontier of techniques (some call them technologies) for making data-informed decisions. Depending on the case (of course), right now, using advanced data analytics with the data available to one's business could be enough. But I am convinced it will not be like that in the next 1-2 years.

      Deep learning has many practical applications other than image recognition. Building recommendation systems, natural language processing, or performing predictive analysis of tabular data all need Deep Learning.

      To conclude, even if for now the performance advantage of DL over other Machine Learning algorithms (e.g. XGBoost) may not be so striking, consider the pace of advancement in this field of study over the last year: the sooner businesses get into it, the better.

      Tomorrow will be too late. :)

      1. 1

        How would a small to medium sized business, say without a dedicated IT team, be able to leverage Deep Learning? Like practically, how would they set it up and use it?

        1. 1

          There are many ways. However, it depends on the business and what you want to leverage Deep Learning for.

          • First, you define an objective for how to leverage your data (data you have already gathered or will gather in the future), possibly supplemented with other publicly available datasets (reusing a model pretrained on such data is called Transfer Learning).

          • Secondly, you partner with a data scientist and train a deep learning model to optimize a specific task.

          • Thirdly, you can simply incorporate the results of the trained model into your website, just as you do with IP addresses or other information used to personalize the user's experience. Alternatively, you can use the deep learning results just to make informed decisions about new services/products.

          • As time goes by and new customers come in, or other business conditions change, you re-train the deep learning models, incorporating recent data.

          A good example is predicting sales in different geographical areas, for various age categories and so on.

          Is it clear :)?

          1. 1

            I suppose my point is this: leveraging deep learning right now requires partnering with or employing a data scientist, a skill set which is hard to come by and very expensive. So with the model as it is now, we're talking about a six-figure investment, which in my opinion is out of the question for most small to medium businesses.

            There is a space opening up, though, through Microsoft and startups like prowler.io that are investing in platforms that will make engaging with the technology much more mainstream.

            Judging by the rate that both those companies are hiring talent, I suspect it won't be cheap though.

            Good luck to you.

            1. 1

              This problem is exactly what my team is thinking of solving. We want to bring the quality and benefits of data science at an affordable price. We even made a post about it but got no response, so either people are not interested or they do not know how data science can benefit them. Anyway, may I ask what problem you want to solve using data science?


            2. 1

              You don't need to build large platforms like that.

              High-quality results can be achieved by starting with a pilot project before deciding to go all in on building a platform.

              I can do that for five figures :).

              Depending on the case, I can train Deep Learning models that serve small to medium-sized companies extremely well in terms of accuracy and flexibility.

              Good luck to you as well.

              1. 1

                What I mean is, they are creating a platform to allow companies to harness such technologies: you plug your own data into their platform and get results out the other side. Really interesting concept.

                1. 1

                  Yes, that is amazing and is going to happen in the near future.