[Fiware-lab-federation-nodes] [CESNET #134122] Re: experiences with HA

Giuseppe Cossu giuseppe.cossu at create-net.org
Tue Nov 17 14:25:13 CET 2015


Hi all,
I want to share this link, which lists the deployment scenarios for
Neutron: http://docs.openstack.org/networking-guide/deploy.html
As I said, the main problems with HA in OpenStack were related to Neutron,
because the L3 agent was configured as active/passive and was not really
ready for HA. For that reason the OpenStack community developed DVR
(Distributed Virtual Router, introduced in Juno), which - on paper -
solves many of the issues related to Neutron. It certainly overcomes
several limitations of the Neutron architecture (performance, scalability,
and the bottleneck of the networking node).

I can confirm from direct experience that Juno with the legacy L3 agent is
quite stable in a production environment.
Regarding Kilo, I would suggest using DVR, but, as Fanis stated, there
could be some unexpected issues... so it is up to the IOwner to choose the
wisest option.

NOTE: with Fuel 7.0 you cannot choose between with-HA and without-HA. It
always deploys an HA environment, so with Fuel you have to manage the
Corosync/Pacemaker cluster. That means Neutron is also installed in HA.
Fuel 7.0 has an additional option for the Neutron installation: you can
choose whether or not to use DVR (if you do not select DVR, the legacy L3
agent is used).
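
To make the DVR/legacy choice a bit more concrete, here is a minimal sketch
of how it maps onto Neutron settings. It assumes the standard Neutron
options router_distributed (neutron.conf, server side) and agent_mode
(l3_agent.ini, with values legacy, dvr and dvr_snat). The function name
l3_agent_settings and the role names "network" and "compute" are only
illustrative, and this is not meant to reproduce what Fuel itself writes.

```python
# Illustrative only: maps "use DVR or not" onto Neutron options.
# router_distributed and agent_mode are standard Neutron options;
# l3_agent_settings and the role names are hypothetical helpers.

def l3_agent_settings(use_dvr: bool, role: str) -> dict:
    """Return the L3-related Neutron options for a node of the given role."""
    if role not in ("network", "compute"):
        raise ValueError("role must be 'network' or 'compute'")

    if not use_dvr:
        # Legacy: routing is centralised on the network node(s);
        # compute nodes do not run an L3 agent at all.
        mode = "legacy" if role == "network" else None
        return {"router_distributed": "False", "agent_mode": mode}

    # DVR: compute nodes route locally (agent_mode = dvr), while
    # network nodes keep the centralised SNAT part (agent_mode = dvr_snat).
    mode = "dvr_snat" if role == "network" else "dvr"
    return {"router_distributed": "True", "agent_mode": mode}


if __name__ == "__main__":
    for use_dvr in (False, True):
        for role in ("network", "compute"):
            print(use_dvr, role, l3_agent_settings(use_dvr, role))
```

The point of the sketch is only that, without DVR, all L3 traffic depends
on the network node, whereas with DVR part of the routing moves to the
compute nodes.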

Regarding the OpenStack architecture and procedures for HA, Mirantis
offers very useful documentation:
https://docs.mirantis.com/openstack/fuel/fuel-7.0/#guides . For HA in
particular, see:
https://docs.mirantis.com/openstack/fuel/fuel-7.0/operations.html and
https://docs.mirantis.com/openstack/fuel/fuel-7.0/reference-architecture.html#multi-node-with-ha-deployment

Regards,
Giuseppe


On Tue, Nov 17, 2015 at 1:17 PM, Sean Murphy <murp at zhaw.ch> wrote:

> Hi again all,
>
> This is a follow-up to the discussion on the confcall this morning
> (which I found very useful - it might be good if we have more discussion
> of these important issues on the calls from time to time).
>
> The status of the Spanish node was not clear to me: I did not concretely
> understand what Fernando said regarding HA. From previous communication,
> I understand that they chose not to use HA in Juno; in the minutes of
> today's meeting, I see
>
> "Migrated to Kilo, pending swift migration (waiting help from IBM)"
>
> @Fernando - can you tell us if you went with HA in Kilo?
>
> BR,
> Seán.
>
>
> On Mon, Nov 16, 2015 at 9:27 AM, Murphy Seán (murp) <murp at zhaw.ch> wrote:
>
>> Hi Fede, all,
>>
>>> juno HA is quite stable in our experience. the problems are always
>>> related to the neutron when you restart a
>>>
>>
>> Good to hear.
>>
>>
>>> node. so rule number one, if you need to restart, use corosync to take
>>> your node out. this will do a graceful re-balancing among l3 agents. in
>>> case of sudden "death" of the node, the problem is not so much in that,
>>> but in when you re-attach the node. also in this case correct management
>>> of corosync is the trick.
>>>
>>
>> Thanks for the pointers - I may ask for more info on the confcall as I
>> don't fully get the point here. Also, it would be good to know if this
>> also applies to Kilo.
>>
>>
>>> In case you have not noticed, following the new DoW in FI-CORE and the
>>> Open Call, requirements on SLA and availability are quite strict, so if
>>> your node dies because the only controller you have is un-recoverable, and
>>> because of that you breach the required availability threshold, this may
>>> have financial implications for FI-CORE nodes.
>>>
>>
>> Thanks for pointing that out. I guess everyone has a strong interest in
>> having the systems as reliable as possible - unreliable systems give lots
>> of headaches. I guess what I was interested in knowing is whether HA is
>> likely to make the system more reliable or less reliable: the experience
>> in XiFi was that it seemed to make things less reliable.
>>
>> BR,
>> Seán.
>>
>>
>>
>>> Br,
>>> Federico
>>>
>>> --
>>> Future Internet is closer than you think!
>>> http://www.fiware.org
>>>
>>> Official Mirantis partner for OpenStack Training
>>> https://www.create-net.org/community/openstack-training
>>>
>>> --
>>> Dr. Federico M. Facca
>>>
>>> CREATE-NET
>>> Via alla Cascata 56/D
>>> 38123 Povo Trento (Italy)
>>>
>>> P  +39 0461 312471
>>> M +39 334 6049758
>>> E  federico.facca at create-net.org
>>> T @chicco785
>>> W  www.create-net.org
>>>
>>> On Fri, Nov 13, 2015 at 11:54 AM, Theofanis Katsiaounis <
>>> th_katsiaounis at neuropublic.gr> wrote:
>>>
>>>> Hi all,
>>>> Indeed Kilo could solve the network issues since networking is HA
>>>> capable too.
>>>> Containers/Swift can be a problem especially since you have to leave
>>>> space to create the storage rings etc.
>>>>
>>>> Regards,
>>>> Fanis
>>>>
>>>> On 13/11/2015 12:50 PM, Cristian Cristelotti wrote:
>>>> > Hi Sean,
>>>> >
>>>> > Our experience with Grizzly (HA) was very bad. Icehouse (HA) was
>>>> > better but not stable. Now we are on Juno on a single node and we
>>>> > haven't faced any problems.
>>>> > We are working on the migration to Kilo (HA + Murano + Ceilometer).
>>>> >
>>>> > Kilo seems to have solved the problems mentioned by Fanis.
>>>> > If you do not deploy the node with HA, you will not have containers
>>>> > functionality, or rather you will have to install Swift manually after
>>>> > the Fuel deployment.
>>>> >
>>>> >
>>>> >
>>>> > Regards
>>>> >
>>>> > Cristian
>>>> >
>>>> > ----- Original message -----
>>>> > From: "Sean Murphy" <murp at zhaw.ch>
>>>> > To: "Theofanis Katsiaounis" <th_katsiaounis at neuropublic.gr>
>>>> > Cc: fiware-lab-federation-nodes at lists.fiware.org
>>>> > Sent: Friday, 13 November 2015 11:40:13
>>>> > Subject: Re: [Fiware-lab-federation-nodes] [CESNET #134122] Re:
>>>> > experiences with HA
>>>> >
>>>> >
>>>> >
>>>> > Hi all,
>>>> >
>>>> >
>>>> > So the feedback so far is the following:
>>>> > - Riwal says that running Juno/HA is not so problematic, but has not
>>>> >   had a specific failure situation where HA could really be tested
>>>> > - Fernando notes that Juno/HA exhibited stability problems for larger
>>>> >   numbers of users and decided against it
>>>> > - Fanis notes that Icehouse/HA was quite problematic in multiple
>>>> >   respects
>>>> >
>>>> >
>>>> > From our point of view, this is not painting a very positive picture
>>>> > regarding HA, and despite our inclination to experiment with newer
>>>> > technologies we would probably opt not to use HA.
>>>> >
>>>> >
>>>> > Does anyone in the project have Kilo/HA experience?
>>>> >
>>>> >
>>>> > BR,
>>>> > Seán.
>>>> >
>>>> > On Fri, Nov 13, 2015 at 10:38 AM, Theofanis Katsiaounis <
>>>> > th_katsiaounis at neuropublic.gr > wrote:
>>>> >
>>>> > Hi all,
>>>> > we had HA on Icehouse and it was a mess, especially the
>>>> > Networking/Neutron part. Namespaces were not transferred between
>>>> > nodes, so if one went down VMs lost networking. Reboots were a lottery
>>>> > indeed: sometimes they worked, sometimes they did not. And when we
>>>> > lost power once I had to rebuild the node.
>>>> > Of course the FIWARE Lab handbook asks for an HA solution, but I see
>>>> > that in the case of Spain this has already been violated ;).
>>>> > My two cents is that the guys from Spain made the right choice. I do
>>>> > not think HA in OpenStack is ready for production, especially with a
>>>> > big number of users.
>>>> >
>>>> > Regards,
>>>> > Fanis
>>>> >
>>>> >
>>>> > On 13/11/2015 11:33 AM, Riwal KERHERVE wrote:
>>>> >
>>>> > Sean,
>>>> >
>>>> > In Grizzly, any time we needed to restart processes handled by CRM, it
>>>> > was a lottery. Sometimes everything went fine, and sometimes the
>>>> > processes kept restarting and it took us hours to put things back
>>>> > in order.
>>>> >
>>>> > In Juno, we never experienced this kind of behavior. When we needed
>>>> > to restart processes through CRM, everything always went fine.
>>>> >
>>>> > To answer your question:
>>>> >
>>>> > The only time we played with HA was to apply some modifications to
>>>> > our configuration files. I do not recall exercising HA capabilities,
>>>> > such as taking one node down and switching all processes to the
>>>> > other node.
>>>> >
>>>> >
>>>> >
>>>> > BR
>>>> >
>>>> > Riwal
>>>> >
>>>> >
>>>> >
>>>> > From: sean at gopaddy.ch [ mailto:sean at gopaddy.ch ] On behalf of
>>>> > Sean Murphy
>>>> > Sent: Thursday, 12 November 2015 17:01
>>>> > To: Riwal KERHERVE
>>>> > Cc: fiware-lab-federation-nodes at lists.fiware.org
>>>> > Subject: Re: [CESNET #134122] Re: [Fiware-lab-federation-nodes]
>>>> > experiences with HA
>>>> >
>>>> > Hi Riwal,
>>>> >
>>>> > Good feedback - thanks for that.
>>>> >
>>>> > As a matter of interest, have you ever needed to exercise any of the
>>>> > HA capabilities or have you tested it in anger?
>>>> >
>>>> > BR,
>>>> > Seán.
>>>> >
>>>> > On Thu, Nov 12, 2015 at 4:51 PM, Riwal KERHERVE via RT < xifi-support@
>>>> > rt.cesnet.cz > wrote:
>>>> >
>>>> > Sean,
>>>> >
>>>> > I do not have experience with Kilo in HA, but our node is on Juno and
>>>> > in HA. We installed it with Fuel 6.0 (2 controllers and 1 arbitrator).
>>>> > We have never had any trouble so far: it is very stable, nothing like
>>>> > HA in Grizzly.
>>>> >
>>>> > BR
>>>> > Riwal
>>>> >
>>>> > From: fiware-lab-federation-nodes-bounces at lists.fiware.org [mailto:
>>>> > fiware-lab-federation-nodes-bounces at lists.fiware.org ] On behalf of
>>>> > Sean Murphy
>>>> > Sent: Thursday, 12 November 2015 16:33
>>>> > To: fiware-lab-federation-nodes at lists.fiware.org
>>>> > Subject: [Fiware-lab-federation-nodes] experiences with HA
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > Hi all,
>>>> >
>>>> > We're looking at our upgrade strategy and we're curious to
>>>> > hear any experience with Kilo HA both from the deployment
>>>> > perspective as well as the operations perspective.
>>>> >
>>>> > From XiFi, I remember Fanis reporting a split-brain scenario
>>>> > with HA and in the end he opted not to go with a HA solution;
>>>> > this gives me pause for thought when considering this
>>>> > deployment solution, even though it seems to be the
>>>> > preferred solution.
>>>> >
>>>> > Generally, we would be well disposed to a HA deployment
>>>> > as we would like to learn about it, but we do not want to
>>>> > end up deploying a technology that is too far from production
>>>> > readiness.
>>>> >
>>>> > Does anyone have any experience that they can share on this
>>>> > point?
>>>> >
>>>> > BR,
>>>> > Seán.
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > _______________________________________________
>>>> > Fiware-lab-federation-nodes mailing list Fiware-lab-federation-nodes@
>>>> lists.fiware.org
>>>> https://lists.fiware.org/listinfo/fiware-lab-federation-nodes
>>>> >
>>>> >
>>>> > Disclaimer
>>>> >
>>>> >
>>>>
>>>>
>>>>
>>>
>>
>
>


-- 
--------------------------------------------------------
Giuseppe Cossu
CREATE-NET
Smart Infrastructures
Research Engineer
Via alla Cascata 56/D - 38123 Povo Trento (Italy)
e-mail: giuseppe.cossu at create-net.org
Tel: (+39) 0461312428
www.create-net.org
--------------------------------------------------------